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ABSTRACT 


An  imporlant  and  neglected  program  property  is  that  the  program  executes  cleanly,  that  is,  without 
execution  errors  such  as  dividing  by  zero,  referencing  an  uninitialized  variable,  and  aliasing.  The 
language  designer  has  lacked  a  precise  way  of  specifying  when  execution  errors  occur,  and  the  language 
user  has  lacked  good  means  of  proving  they  will  not  occur. 

This  thesis  attacks  these  two  closely  related  problems,  providing  a  method  for  formally  defining 
conditions  that  ensure  the  absence  of  execution  errors.  The  method  uses  rewrite  rules  based  on  a 
grammar  for  the  programming  language  in  question.  Using  the  method,  a  suificient  condition  to  ensure 
the  absence  of  execution  errors  can  be  defined  for  each  construct  in  the  language.  Applying  the  rewrite 
rules  to  a  program  segment  yields  the  corresponding  clean  execution  condition. 

Using  rewrite  rules  alone,  clean  execution  conditions  cannot  be  defined  for  statement  sequences, 
declarations  composed  with  their  associated  scopes,  and  parameterized  declarations  (procedure  and 
function  declarations,  loops,  and  parameterized  types).  The  rewrite  rule  approach  is  extended  to  handle 
these  constructs  using  predicate  transformers,  substitution  functions,  and  inference  rules  respectively. 
The  method  is  illustrated  on  a  Pascal-like  language;  The  clean  execution  of  much  of  the  programming 
language  Euclid  is  defined  in  an  appendix. 

This  thesis  provides  the  first  method  for  formally  defining  clean  execution  conditions,  a  notation 
for  documenting  design  decisions  about  execution  constraints,  an  independent  discipline  for  considering 
clean  execution  during  language  design,  and  better  tools  for  verifying  clean  execution. 
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1.  CLEAN  EXECUTION,  A  NEGLECTED  PROGRAM  PROPERTY 


Proving  that  a  program  does  its  intended  job,  a  critical  part  of  software  development,  is  not  a  single 
task,  but  requires  proving  a  number  of  properties.  Properties  often  proved  are  that  the  program  is  syn¬ 
tactically  correct,  that  it  is  consistent  with  its  specification,  and  that  it  halts. 

A  larger  class  of  programs,  however,  can  be  proved  syntactically  correct  than  can  be  proved  con¬ 
sistent  with  their  specifications.  The  gap  between  the  two  classes  corresponds  to  an  important  and 
neglected  program  property,  that  the  program  executes  without  execution  errors  such  as  dividing  by 
zero,  referencing  an  uninitialized  variable,  or  aliasing. 

This  gap  is  important  to  both  the  language  designer  and  the  language  user.  The  language  designer 
must  ensure  that  he  has  specified  all  execution  errors  that  can  occur,  and  the  programmer  must  ensure 
that  none  of  these  will  occur  in  his  program.  However,  the  language  designer  has  lacked  a  precise  way 
of  specifying  which  execution  errors  can  occur,  and  the  language  user  good  means  of  proving  they  will 
not  occur,  i 

This  thesis  attacks  these  two  closely  related  problems.  In  this  chapter  we  define  the  problem  area 
and  discuss  the  problems  from  both  the  programmer’s  and  the  language  designer’s  points  of  view.  We 
then  outline  a  plan  of  attack,  summarize  the  contributions  of  this  research,  and  describe  the  organiza¬ 
tion  of  the  rest  of  the  dissertation. 

1.1  DEFINITION  OF  PROBLEM  AREA 

There  are  two  kinds  of  specifications  (sets  of  properties)  with  which  a  program  must  be  consistent: 
program  specifications  and  programming  language  definitions.  If  a  program  is  consistent  with  its  pro¬ 
gram  specification,  we  shall  say  that  it  is  correct.  If  a  program  is  consistent  with  the  programming 
language  definition,  we  shall  say  that  it  is  legal.  The  traditional  term  "correct"  is  somewhat  misleading 
since  a  program  may  be  consistent  with  one  program  specification  and  not  with  another.  However, 
based  on  a  given  language  specification,  every  program  is  either  legal  or  not,  independent  of  its  pro¬ 
gram  specification.  An  illegal  program  has  no  meaning,  and  thus  is  neither  correct  nor  incorrect. 

What  constitutes  a  legal  program  is  defined  by  constraints  on  the  program.  For  our  purposes,  there 
are  two  kinds  of  constraints,  statically  enforceable  constraints  and  execution  constraints.  A  constraint 
whose  enforcement  may  depend  on  an  input  value  is  an  execution  constraint.  A  constraint  that  cannot 
is  statically  enforceable.  Following  common  usage,  we  shall  sometimes  refer  to  statically  enforceable 
constraints  as  syntax.  In  Chapter  2  we  survey  execution  constraints  in  existing  languages. 

The  violation  of  an  execution  constraint  is  an  execution  error.  We  shall  say  that  a  program  executes 
cleanly  if  none  of  its  executions  lead  to  execution  errors. 

1.2  SEPARATING  PROGRAMMER  CONCERNS 

Traditionally,  it  has  been  the  programmer’s  responsibility  to  prove  that  his  program  meets  its 
specification  and  the  language  implementation’s  responsibility  to  prove  that  it  meets  statically  enforce¬ 
able  constraints.  Clean  execution,  however,  is  seldom  proven. 

This  can  be  partially  attributed  to  the  undecidability  of  the  task.  Since  a  statement  can  result  in  an 
execution  error  only  if  it  is  executed,  any  algorithm  that  could  decide  the  absence  of  a  particular  class 
of  execution  errors  could  be  used  to  decide  whether  a  program  halts.  Indeed,  a  requirement  that  all 
programs  halt  would  be  an  execution  constraint. 
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The  situation,  however,  is  not  explained  by  undecidability  because  proofs  of  correctness  face  pre¬ 
cisely  the  same  problem.  Rather,  clean  execution  has  fallen  into  the  crack  between  syntactic  correct¬ 
ness  and  correctness.  Since  execution  constraints  are  not  statically  enforceable,  they  have  seldom  been 
considered  the  responsibility  of  the  language  implementation.  Since  they  are  not  an  explicit  part  of  the 
program  specification,  they  have  even  less  often  been  considered  the  verifier’s  responsibility.  More¬ 
over,  including  them  would  vastly  increase  the  number  of  trivial  propositions  for  the  verifier  to  prove. 

A  legal  program,  that  is,  one  whose  meaning  could  be  considered,  has  been  one  that  satisfied  stati¬ 
cally  enforceable  constraints.  If  we  are  to  keep  clean  execution  from  falling  into  the  crack  between  syn¬ 
tactic  correctness  and  correctness,  we  must  recognize  it  as  part  of  legality. 

This  simplifies  the  programmer’s  job  in  two  ways.  First,  separating  clean  execution  issues  from 
correctness  issues  makes  both  easier  to  address.  Second,  as  with  all  language  constraints,  the  language 
implementation  can  then  enforce  execution  constraints.  For  example,  a  variable  declaration  restricts 
the  values  that  can  be  assigned  to  the  variable  and  the  operators  that  can  manipulate  the  variable.  The 
programmer  intends  to  use  the  variable  in  only  legal  ways  and  should  be  able  to  rely  on  the  implemen¬ 
tation  to  enforce  these  restrictions. 

1.3  A  FAILING  OF  EXISTING  LANGUAGE  DEFINITIONS 

Whether  recognized  as  part  of  legality  or  not,  clean  execution  has  generally  been  an  afterthought  in 
informal  as  well  as  formal  language  definitions.  Attention  has  been  focused  instead  on  the  meaning  of 
what  should  occur. 

It  is  a  tenet  of  this  research  that  a  language  definition  must  also  specify  what  should  not  occur. 
Before  the  programmer  can  prove  that  execution  errors  do  not  occur,  the  language  designer  must  define 
conditions  that  ensure  they  will  not. 

By  this  criterion,  existing  language  definitions  are  inadequate.  In  particular,  they  do  not  specify  for 
each  construct  of  the  language  a  sufficient  condition  to  ensure  the  absence  of  execution  errors.  We 
shall  call  such  a  sufficient  condition  a  clean  execution  condition.  (The  single  exception  is  the  program¬ 
ming  language  Euclid  [Lampson  et  al.  77],  which  we  discuss  in  Chapter  3  and  which  does  not  define 
clean  execution  conditions  precisely.)  For  example,  let  defined(x)  be  the  condition  that  the  variable  "x" 
has  been  assigned  a  value.  A  language  definition  might  specify  that  the  clean  execution  condition  for  a 
reference  to  "x"  rs  defined(x),  and  that  for  the  integer  expression  x/y  it  is 

defined(x)  &  defined(y)  &  y?^0 


A  clean  execution  condition  is  a  precondition;  if  it  is  satisfied,  then  the  associated  construct  can  be 
executed  cleanly.  In  general  it  deals  only  tangentially  with  what  the  associated  construct  means.  If,  for 
example,  the  language  included  a  mod  operator,  x  mody  might  have  the  same  clean  execution  condition 
as  x/y.  What  a  clean  execution  condition  ensures  is  that  whatever  the  associated  construct  does,  it  does 
it  cleanly,  without  execution  errors. 

One  might  expect  that  as  programming  languages  have  evolved,  language  designers  would  better 
understand  the  problems  of  execution  errors  and  better  know  how  to  deal  with  them.  However,  quite 
the  opposite  is  true.  Features  such  as  parameterized  types,  modules,  more  sophisticated  type  compati¬ 
bility  rules,  import  and  export  constraints,  and  aliasing  restrictions  often  introduce  execution  errors. 
Such  features  also  interact  with  other  language  features,  producing  further  execution  errors.  Moreover, 
language  designers  have  chosen  in  some  cases  to  introduce  the  possibility  of  execution  errors,  such  as 
by  forbidding  aliasing,  precisely  to  restrict  the  class  of  legal  programs  to  ones  that  they  knew  better  how 
to  reason  about.  By  making  it  harder  to  prove  programs  legal,  they  hoped  to  make  it  easier  to  prove 
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•legal  programs  correct. 

1.4  FOCUS  ON  DERIVING  CLEAN  EXECUTION  CONDITIONS 

There  are  two  parts  to  proving  a  program  property:  expressing  the  property  and  verifying  that  the 
property  is  always  satisfied.  Thus,  proving  that  a  construct  executes  cleanly  requires  first  deriving  and 
then  verifying  a  clean  execution  condition  for  the  construct. 

Unlike  previous  work  on  proving  the  absence  of  execution  errors,  which  we  survey  in  Chapter  3. 
this  thesis  focuses  on  deriving  clean  execution  conditions.  There  are  several  reasons  for  this  focus: 

First,  existing  ways  of  defining  programming  languages  and  hence  of  deriving  clean  execution  con¬ 
ditions  are  inadequate,  as  discussed  above. 

Second,  as  we  shall  see,  clean  execution  can  be  expressed  in  essentially  the  same  terms  as  correct¬ 
ness.  Thus,  having  derived  a  clean  execution  condition,  we  can  rely  largely  on  existing  program 
verification  technology  to  verify  it. 

Third,  thus  far  we  have  considered  the  programming  language  in  question  as  a  finished  product. 
Language  designers,  however,  need  a  discipline  for  considering  clean  execution  issues  during  language 
design.  Interactions  of  language  features  are  a  particularly  nasty  source  of  execution  errors  and  one 
that  past  language  designers  have  had  little  help  in  confronting.  As  in  defining  proof  rules,  constructing 
clean  execution  rules  is  a  useful  discipline  that  forces  the  language  designer  to  be  explicit  about  interac¬ 
tions  of  language  features. 

1.5  PLAN  OF  ATTACK 

Our  goal  is  a  precise  way  of  defining  clean  execution  that  enables  us  to  derive  clean  execution  con¬ 
ditions.  To  ensure  that  interactions  of  language  features  are  properly  taken  into  account,  a  rigorous 
definition  is  essential.  As  has  been  argued  in  defining  proof  rules  [Hoare  and  Wirth  73],  (1)  a  rigorous 
definition  can  serve  as  a  reference  for  the  language,  (2)  attempting  to  define  a  language  rigorously  can 
reveal  irregularities  of  structure,  and  (3)  the  resulting  definition  can  be  used  in  proving  program  pro¬ 
perties. 

The  basic  approach  is  as  follows.  We  shall  view  the  programming  language  in  question  as  an 
expression  language.  An  operation  can  be  executed  cleanly  if  its  operands  are  legal  and  if  the  operands 
can  be  combined  legally  by  the  operator  involved.  To  generate  clean  execution  conditions,  we  shall  use 
rewrite  rules  based  on  a  grammar  for  the  programming  language.  The  rewrite  rules  provide  both  a 
definition  of  clean  execution  and  an  algorithm  for  translating  a  programming  language  expression  to  a 
clean  execution  condition. 

t 

Ideally,  we  would  generate  all  clean  execution  conditions  with  rewrite  rules  alone.  This  purely  syn¬ 
tactic  approach,  however,  cannot  handle  statement  composition  (side-effects)  or  routines  and  loops,  and 
can  handle  declarations  composed  with  their  scopes  only  tediously.  We  shall  extend  the  rewrite  rule 
approach,  adding  predicate  transformers  to  handle  statement  composition,  pre-  and  postconditions  for 
routines  and  loops,  and  substitution  functions  for  declarations. 

It  is  instructive  to  contrast  how  expressions,  declarations,  and  statements  are  dealt  with  in  proving 
correctness  and  in  proving  clean  execution.  If  there  are  no  side-effects  in  functions,  expressions  can  be 
treated  as  purely  unstructured,  algebraic  entities  in  proving  correctness  properties.  Clean  execution, 
however,  depends  critically  on  the  structure  of  expressions.  Similarly,  declarations  are  more  important 
in  proving  clean  execution.  The  redundancy  they  provide  is  used  primarily  in  checking  that  a  program 
is  legal.  There  are  relatively  few  execution  constraints  on  statements,  which  are  the  focus  in  proving 


correctness  properties.  Statements  are  primarily  a  clean  execution  concern  where  the  strict  distinction 
between  clean  execution  and  correctness  breaks  down:  statement  composition  and  routines  and  loops. 
Here,  the  clean  execution  condition  cannot  be  generated  purely  from  the  program  text. 

1.6  CONTRIBUTIONS  OF  THIS  RESEARCH 

This  thesis  investigates  the  role  clean  execution  conditions  play  in  language  definition.  The  major 
contribution  of  this  work  is  a  language  definition  tool:  a  method  for  formally  defining  clean  execution 
conditions.  Using  this  method,  we  can  specify  sufficient  conditions  to  ensure  clean  execution  for  all  the 
constructs  of  a  language.  This  thesis  provides 

(1)  the  first  tool  for  formally  defining  clean  execution  conditions. 

(2)  a  notation  for  documenting  design  decisions  about  execution  constraints,  resulting  in  a 
more  complete  language  definition. 

(3)  an  independent  discipline  for  considering  clean  execution  during  language  design.  The 
exercise  itself,  as  well  as  the  result  of  the  exercise,  is  important. 

(4)  better  tools  for  verifying  clean  execution.  Moreover,  even  if  a  verification  fails,  it  can 
expose  program  faults,  such  as  implicit  assumptions  about  how  a  routine  is  used  or  degen¬ 
erate  cases  of  data  structures  or  loops. 

(5)  clean  execution  conditions  for  much  of  Euclid,  thus  testing  the  method  developed. 

1.7  ORGANIZATION  OF  THIS  DISSERTATION 

In  the  first  three  chapters  we  introduce  and  define  the  problem.  In  Chapter  2  we  survey  sources  of 
execution  errors  in  existing  languages.  In  Chapter  3  we  survey  previous  work  on  proving  the  absence 
of  execution  errors  and  contrast  this  approach  to  enforcing  execution  constraints  with  run-time  check¬ 
ing. 


In  Chapters  4  through  8  we  describe  the  rewrite  rule  approach  that  is  at  the  heart  of  this  thesis.  In 
Chapter  4  we  define  an  assertion  language,  review  the  basic  concepts  of  rewrite  rules,  and  use  rewrite 
rules  to  define  the  clean  execution  of  a  simple  arithmetic  expression  language.  In  Chapters  5,  6.  and  7 
we  define  the  clean  execution  of  an  example  language.  We  proceed  largely  by  example,  specifying 
clean  execution  rules  for  successively  more  sophisticated  constructs.  The  method  is  defined  more  pre¬ 
cisely  in  Chapter  8.  Appendix  B  gives  a  clean  execution  definition  for  much  of  Euclid;  in  Chapter  8  we 
also  highlight  features  of  that  definition. 

In  Chapter  9  we  review  the  accomplishments  of  this  research.  After  making  some  observations 
about  the  method  and  its  use,  we  assess  the  contributions  of  this  work  and  suggest  directions  for  future 
research. 
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2.  SOURCES  OF  EXECUTION  ERRORS 

This  chapter  surveys  language  constructs  and  combinations  of  language  constructs  that  can  result  in 
execution  errors.  As  throughout  the  rest  of  this  thesis,  we  shall  assume  that  the  program  text  meets 
statically  enforceable  constraints.  This  assumption  allows  us  to  focus  on  the  problem  at  hand  and  is 
made,  usually  implicitly,  by  ail  methods  for  proving  correctness  properties.  Also,  we  shall  not  distin¬ 
guish  execution  errors  that  happen  to  be  statically  enforceable,  such  as 

X  2/0 

or 

var  x:  1..2; 

X  3 


Different  language  designers  have  made  different  decisions  not  only  about  which  features  to 
inciude,  but  aiso  about  which  aspects  of  features  to  proscribe.  For  example,  in  CLU  [Liskov  et  ai.  79] 
referencing  an  uninitialized  variable  is  an  error;  in  Euclid  [Popek  et  al.  77]  it  is  not.  Thus  we  can  talk 
about  an  execution  error  only  with  respect  to  a  given  language  definition. 

The  particular  decision  about  referencing  uninitialized  variables  is  of  only  secondary  concern  to  us. 
What  is  important  in  both  cases  is  that  the  language  designer  has  specified  what  is  legal;  legality  is  too 
important  an  issue  to  be  left  to  the  language  implementor.  Throughout  most  of  this  thesis  we  shall  try 
to  be  neutral  about  such  language  design  decisions,  concentrating  instead  on  providing  language 
designers  with  a  precise  notation  for  documenting  these  decisions. 

Two  sets  of  constraints  have  a  major  influence  on  which  constraints  are  statically  enforceable:  scop¬ 
ing  and  binding  constraints.  Scoping  constraints  define  where  identifiers  are  visible.  Binding  con¬ 
straints  define  how  identifiers  are  associated  with  objects  such  as  values,  types,  and  procedures.  In  For¬ 
tran  [ANSI  66],  for  example,  the  constraint  that  an  operand  have  a  scalar  value  would  be  statically 
enforceable.  In  APL  [Falkoff  and  Iverson  75]  both  scoping  and  binding  constraints  would  prevent  this 
constraint  from  being  statically  enforceable  in  general. 

For  most  of  this  thesis  we  shall  consider  execution  constraints  that  are  largely  independent  of  scop¬ 
ing  and  binding  constraints.  We  restrict  ourselves  for  now  to  static  scoping  and  objects  with  statically 
determinable  types.  (We  shall,  however,  allow  constants  in  declarations  to  be  bound  upon  scope 
entry.)  There  are  two  reasons  for  this  restriction.  First,  the  restriction  allows  a  rich  class  of  execution 
errors,  including  essentially  all  those  of  block-structured  languages  such  as  Fortran,  Algol  60  [Naur  63], 
PL/I  [ANSI  76],  Pascal  [Jensen  and  Wirth  74],  Euclid,  and  Modula  [Wirth  77a].  Second,  the  restric¬ 
tion  facilitates  the  largely  syntactic  method  for  generating  clean  execution  conditions  to  be  proposed. 
Similar  restrictions  are  exploited  by  most  methods  for  generating  verification  conditions.  In  Chapter  8 
we  shall  discuss  how  to  accommodate  other  execution  constraints  within  the  approach  taken  for  static 
scoping  and  statically  determinable  types. 

2.1  EXECUTION  ERRORS  AS  DOMAIN  ERRORS 

In  characterizing  execution  errors,  we  shall  view  the  programming  language  in  question  as  an 
expression  language:  a  program  is  an  expression,  some  of  whose  subexpressions  have  side-effects.  An 
assignment  operation  has  the  side  effect  of  assigning  a  value  to  a  variable;  a  variable  declaration  has  the 
side-effect  of  causing  the  variable  to  be  allocated.  Note  that  the  operator  involved  may  be  implicit,  as  is 
usually  the  case  for  the  referencing  operator. 
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In  this  view,  an  execution  error  results  from  violating  an  input  constraint  of  some  operator,  as  in 
dividing  by  zero  or  arithmetic  overflow.  Thus,  execution  errors  can  occur  whenever  an  operator  is  not 
total. 

Because  of  the  diversity  of  operators,  many  execution  errors  occur  in  expressions.  In  addition  to 
dividing  by  zero  and  arithmetic  overflow,  these  include  referencing  an  uninitialized  variable,  using  an 
array  index  that  is  out  of  range,  taking  the  square  root  of  a  negative  number,  and  taking  the  successor 
of  the  last  element  in  an  enumerated  type. 

Execution  errors  can  also  occur  in  statements,  such  as  type  incompatibility  in  assignment  and  pro¬ 
cedure  statements.  Many  sources  of  execution  errors  are  common  to  a  wide  variety  of  programming 
languages;  others  are  specific  to  a  particular  feature  in  a  particular  language.  For  example,  in  Fortran 
the  values  of  the  start,  final,  and  increment  parameters  of  the  iterative  DO  statement  must  be  non¬ 
negative.  These  parameters  can  be  variable  references  and  thus  are  a  source  of  execution  errors. 

There  are  relatively  fewer  execution  constraints  on  declarations.  One  class  of  errors  that  occur  in 
declarations  is  exceeding  size  limits.  For  example,  a  language  definition  might  specify  a  maximum  size 
for  arrays  or  an  upper  limit  on  the  size  of  the  base  type  for  sets.  In  languages  where  constants  in 
declarations  need  not  be  bound  until  scope  entry,  the  added  flexibility  means  that  a  declare  operation 
cannot  in  general  be  checked  statically. 

Such  size  constraints  may  seem  merely  implementation  details.  It  is  the  language  designer’s 
responsibility,  however,  to  identify  which  constraints  are  part  of  the  language  definition.  Thus,  it  must 
be  his  decision  whether  arithmetic  overflow  is  a  program  failure  or  an  implementation  headache. 

Since  the  variety  of  execution  errors  is  limited  only  by  the  language  designer’s  imagination,  we 
shall  have  to  content  ourselves  with  being  illustrative.  In  Chapters  5,  6,  and  7  we  shall  examine  the 
execution  errors  of  an  entire  language.  In  the  remainder  of  this  chapter  we  explore  four  classes  of 
more  subtle  execution  problems;  referencing  errors,  type  compatibility  errors,  interactions  of  language 
features,  and  aliasing. 

2.2  REFERENCING  ERRORS 

Attempting  to  reference  a  variable  can  result  in  any  of  a  number  of  errors:  an  array  index  may  be 
out  of  range,  a  record  component  selector  for  a  variant  record  may  select  a  field  that  does  not  exist  in 
the  current  variant,  a  pointer  identifying  a  dynamic  variable  may  be  null,  or  a  referenced  dynamic  vari¬ 
able  may  have  been  deallocated. 

Even  if  a  variable  is  properly  identified,  it  may  be  uninitialized.  Referencing  uninitialized  variables 
is  a  particularly  tricky  problem.  As  Hoare  [68]  notes. 

This  error  is  one  which  is  difficult  to  trace  to  its  source,  particularly  if  it  gives  rise  to  different 
results  on  different  runs  of  the  same  program,  and  hardware  faults  are  suspected.  For  users  of  a 
high-level  programming  language,  a  special  problem  arises,  since  the  results  of  the  run  are  in 
principle  unpredictable  within  the  terms  of  the  programming  language  itself,  and  can  only  be 
explained  in  terms  of  octal  dumps,  etc. 

Referencing  an  uninitialized  variable  is  more  basic  than  other  execution  errors  in  at  least  the  fol¬ 
lowing  sense:  before  we  can  constrain  the  range  of  legal  values  for  a  particular  use  of  a  variable,  we 
must  ensure  that  it  has  a  defined  value  to  be  constrained.  Even  viewing  the  error  as  a  domain  error  is 
somewhat  unsatisfactory:  an  uninitialized  variable  does  not  so  much  have  an  improper  value  as  no 
value  at  all. 
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Referencing  an  uninitialized  variable  is  distinguished  from  other  execution  errors  in  another  way; 
the  meaning  of  the  clean  execution  condition  depends  on  how  it  is  to  be  enforced.  For  verification,  the 
condition  that  a  variable  v  is  initialized  is  a  constraint  that  the  value  of  v  has  been  assigned  a  value  of 
the  the  proper  type,  i.e.,  v  IN  ItsType(v),  where  ItsType(v)  is  the  type  of  v.  If  there  is  a  previous  legal 
assignment  to  v,  then  the  first  v  in  v  IN  ItsType(v)  can  be  replaced  by  the  assigned  value  and  this 
assertion  simplified  to  TRUE. 

On  the  other  hand,  for  run-time  checking  all  that  can  be  checked  without  additional  mechanism  is 
whether  or  not  a  variable's  value  is  in  range,  here  v  IN  ItsType(v)  means  only  that  the  variable's  con¬ 
tents  represents  a  legal  value.  Since  the  value  of  a  variable  depends  on  the  previous  use  of  that 
storage,  checking  that  a  variable  is  in  range  is  a  weaker  check  than  that  the  variable  is  initialized. 
Moreover,  although  it  is  typically  straightforward  (if  expensive)  to  check  at  run-time  that  a  variable  is 
in  range,  this  property  cannot  be  expressed  statically;  it  only  has  meaning  in  terms  of  a  particular  imple¬ 
mentation. 

To  avoid  such  problems,  Dijkstra  made  referencing  an  uninitialized  variable  a  statically  checkable 
property  in  his  little  language  [Dijkstra  76].  He  syntactically  distinguishes  initial  assignment,  imposing  a 
discipline  whereby  each  variable  is  initialized  exactly  once.  All  guarded  commands  within  an  //  com¬ 
mand  initialize  a  variable,  or  none  do.  No  initialization  is  allowed  within  a  do  command.  All  variables 
inherited  from  the  immediately  enclosing  scope  are  identified  as  either  previously  initialized  or  requir¬ 
ing  initialization  within  this  scope.  Although  Dijkstra  adds  considerable  mechanism  to  an  austerely 
beautiful  language  to  eliminate  uninitialized  variables,  the  only  sources  of  execution  errors  that  remain 
are  division  by  zero  and  array  indices  being  out  of  range! 

2.3  TYPE  COMPATIBILITY  ERRORS 

Early  programming  languages  such  as  Algol  60  and  Fortran  had  simple  data  types  and  data  structur¬ 
ing.  Since  they  also  usually  required  fairly  strict  type  compatibility,  type  compatibility  checking  was 
straightforward. 

Severe  complications  can  arise  in  languages  such  as  PL/I  that  do  not  require  strict  type  compatibil¬ 
ity.  For  example,  although  it  is  often  convenient  to  be  able  to  treat  a  character  string  as  a  number  (or 
vice  versa),  it  is  undecidable  whether  or  not  a  string  will  be  a  valid  numeral.  Type  compatibility  check¬ 
ing  in  PL/I  is  complicated  by  automatic  conversions  and  problems  related  to  pointers,  such  as  dangling 
references,  pointers  pointing  to  static  variables,  and  the  lack  of  constraints  on  the  types  of  values 
pointers  can  point  to. 

Even  with  fairly  strict  type  compatibility,  sources  of  execution  errors  remain.  Most  programming 
languages  allow  the  programmer  to  define  his  own  types,  such  as  by  building  record  types  or  defining 
enumerated  or  subrange  types.  User  defined  subranges,  even  if  statically  defined,  are  a  source  of  type 
compatibility  errors.  For  example,  in  an  assignment  statement  in  Pascal  the  value  of  the  expression 
may  lie  outside  the  type  of  the  variable  to  be  assigned  to.  Similarly,  the  type  of  a  variable  used  as  an 
argument  may  be  incompatible  with  the  type  of  the  corresponding  formal  parameter.  Thus, 

var  x:  l..n; 
x  ;*  3*a-l-b; 

is  legal  only  if  1  ^  3*a+b  <  n.  Although  we  can  always  check  statically  that  the  types  involved  have 
the  right  forms,  we  cannot  in  general  check  the  values  of  expressions  used  until  the  program  is  run. 
Note  that  being  able  to  check  even  the  forms  of  types  depends  on  our  assumptions  about  scoping  and 
binding. 
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2.4  INTERACTIONS  OF  LANGUAGE  FEATURES 

Most  sources  of  execution  errors  are  problems  of  individual  language  constructs.  More  pernicious 
are  those  that  result  from  interactions  of  features. 

For  example,  other  language  features  can  compound  the  problems  of  ensuring  type  compatibility. 
We  have  already  mentioned  how  this  occurs  in  PL/I.  In  Pascal,  perhaps  the  most  frequently  cited 
deficiency  is  the  requirement  that  array  bounds  be  statically  determined  together  with  the  strict  type 
compatibility  required  of  index  types  for  formal  and  actual  array  parameters  (Wirth  75].  This  makes  it 
impossible,  for  example,  to  have  a  sorting  or  matrix  multiplication  routine  that  accepts  arbitrary  arrays 
as  arguments.  Partially  in  response  to  this  problem,  some  more  recent  languages,  including  Ada  iDoD 
80],  allow  such  bounds  to  be  constants  that  are  fixed  upon  scope  entry  rather  than  being  fixed  statically. 
This  directly  addresses  the  Pascal  weakness,  but  also  opens  the  door  for  execution  errors.  We  can  no 
longer  check  statically  constraints  involving  the  value  of  such  a  constant,  or  even  guarantee  that  the 
constant’s  definition  assigns  a  valid  value  to  the  constant. 

Still  subtler  problems  remain.  Consider  the  following  situation,  cited  in  [Wortman  79]  from  an 
earlier  version  of  Euclid.  Euclid  introduced  the  idea  of  clean  execution  condition  with  its  legality  asser¬ 
tions,  and  provides  for  their  run-time  enforcement  with  the  checked  option.  Euclid  also  provides  for 
explicit  control  over  visibility  of  identifiers.  Except  for  pervasive  identifiers,  an  identifier  is  visible  in  a 
routine  or  module  if  and  only  if  it  is  explicitly  imported.  Similarly,  an  identifier  declared  within  a 
module  is  visible  outside  if  and  only  if  it  is  explicitly  exported.  This  restricted  accessibility  of  identifiers 
conflicts  with  the  need  to  use  identifiers  to  express  clean  execution  conditions.  Consider,  for  example, 

const  n  arbitraryExpression 
type  T  *  l..n 
procedure  P  =“ 

imports  (T)  {but  not  n] 
begin 

const  n  :=*  23  (obscures  outer  n} 
var  x:  T 

var  y:  1..10  3 

assert  ? 

X  :=  y 
end  P; 

The  condition  to  be  asserted  is  3  ^  n.  The  "n",  however,  is  "n”  of  "T",  which  is  inaccessible  at  that 
point  because  it  is  not  imported  into  "P".  Moreover,  the  declaration  of  "n"  in  "P"  obscures  the  outer  "n" 
so  that  it  could  not  be  imported. 

This  example  is  an  instance  of  a  larger  problem.  The  desired  assertion  is  unspeakable,  there  is  no 
legal  way  to  express  the  needed  assertion.  Other  combinations  of  language  features  can  also  cause 
unspeakable  assertions. 

Unanticipated  language  feature  interactions  such  as  this  are  particularly  displeasing.  Although  the 
possibility  of  an  out-of-bounds  array  reference  seems  a  reasonable  (and  essentially  unavoidable)  price  to 
pay  for  arrays,  language  design  "bugs"  such  as  the  above  can  be  hard  for  the  language  designer,  as  well 
as  the  implementor  and  the  user,  to  recognize  and  deal  with. 

This  last  example  proves  little  more  than  the  fallibility  of  language  designers.  The  example  is 
language  dependent  and  a  bit  picky,  especially  since  the  language  has  been  changed  to  eliminate  this 
bug.  But  in  another  way,  the  pickiness  and  language  dependence  are  exactly  the  point!  Other  than 
general  principles  such  as  cleanness  and  simplicity,  the  language  designer  has  had  little  help  in  avoiding 
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.such  problems.  As  a  result,  even  very  carefully  designed  programming  languages  such  as  Pascal  have 
had  subtle  problems  not  unearthed  until  well  after  the  langauge  was  in  production  use.  What  is  needed 
is  a  notation  for  expressing  clean  execution  and  a  discipline  that  forces  the  language  designer  to  address 
clean  execution  issues,  and  especially  interactions  of  language  features,  during  language  design. 

2.5  ALIASING 

Two  variable  identifiers  are  said  to  be  aliases  if  they  refer  to  the  same  variable.  For  example,  the 
array  element  names  a(i)  and  a(j)  are  aliases  when  i“j,  and  within  the  body  of  a  procedure  a  variable 
parameter  name  and  the  corresponding  argument  name  are  aliases. 

Aliasing  makes  reasoning  about  programs  harder  because  explicitly  changing  the  value  associated 
with  one  identifier  may  implictly  change  the  values  associated  with  others.  Thus,  keeping  track  of  the 
values  of  variables  also  requires  keeping  track  of  the  names  of  variables. 

The  problems  are  particularly  acute  in  formal  reasoning,  especially  for  assignments  and  procedure 
invocations.  Almost  all  axiomatic  proof  methods  assume  that  distinct  identifiers  represent  distinct  vari¬ 
ables.  For  example,  the  prevalent  assignment  axiom  [Hoare  69); 

I 

P{e/x)  {x  :=  e}  P 

depends  critically  on  the  assumption  that  the  assignment  to  "x"  does  not  affect  other  variables. 

Let  us  consider  a  Pascal  example  [Giordano  80]: 
var  a:  integer; 

procedure  initCvjr  x,  y:  integer); 
begin 
x:=*0; 

y:-l; 

end, 

init(a,  a); 

It  is  straightforward  to  establish 

TRUE  D  begin  x:=*0;  y:=*l;  end,,  x-0  &  y»l) 

Using  the  usual  rule  of  substitution  for  procedures,  we  can  conclude 
TRUE  D  wp(init(a,  a),  a“0  &  a“l) 

This  clearly  violates  what  Dijkstra  calls  the  Law  of  the  Excluded  Miracle. 


If  we  are  to  use  a  simple  rule  of  substitution,  arguments  must  be  distinct  variables.  This  restriction 
is  easy  to  enforce  statically  for  entire  variables,  but  cannot  in  general  be  enforced  until  run-time  for 
array  elements  and  dynamic  variables.  If  "a"  is  an  array,  the  arguments  in  p(a,a[i])  always  overlap, 
while  those  in  p(a[i].a[j])  overlap  only  when  i“j.  The  Pascal  Report  [Jensen  and  Wirth  74]  does  not 
proscribe  aliasing,  and  there  is  thus  no  corresponding  source  of  execution  errors.  The  formal  definition 
of  Pascal  [Hoare  and  Wirth  73],  however,  defines  a  different  language  than  does  the  Pascal  Report,  and  a 
ban  on  aliasing  is  a  major  difference. 
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To  allow  assignments  to  be  defined  by  a  simple  rule  of  substitution,  Euclid  forbids  aliasing,  intro¬ 
ducing  the  possibility  of  execution  errors  in  order  to  restrict  the  class  of  legal  programs.  (Long  before 
the  problems  of  aliasing  were  heralded,  the  designers  of  Fortran  insisted  that  aliasing  be  explicit  in 
EQUIVALENCE  statements  and  forbade  procedure  and  function  arguments  overlapping.  Their  motiva¬ 
tion  was  to  simplify  formal  reasoning  of  a  different  kind:  the  language  definition  made  explicit  the 
assumptions  needed  for  a  straightforward  optimizing  compiler  to  do  its  job.) 

[Cartwright  and  Oppen  78]  and  [Schwartz  80]  allow  programs  with  aliasing.  To  do  so,  they  separate 
the  concepts  of  identifier  and  variable,  but  at  the  price  of  adding  considerable  complexity  to  their  proof 
rules.  In  his  forthcoming  book  [Reynolds  80],  Reynolds  takes  a  similar  approach.  He  treats  aliasing  as 
an  instance  of  the  larger  problem  of  interference  of  variables,  using  a  more  complex  program  logic  than 
that  traditionally  used  in  Hoare-like  systems  in  order  to  describe  how  the  meaning  of  a  specification 
depends  on  its  environment. 

2.6  CONCLUDING  REMARK 

0 

Execution  errors  are  a  diverse  lot,  united  only  by  the  difficulty  of  detecting  them  statically.  This 
diversity  and  the  subtlety  of  many  execution  errors  again  argue  for  a  rigorous  way  of  considering  clean 
execution  issues  during  language  design. 
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3.  ENFORCING  EXECUTION  CONSTRAINTS 

A  programming  language  implemenlaiion  has  the  responsibility  to  enforce  all  language  constraints. 
If  it  chooses  not  to  enforce  a  constraint  on  a  construct  statically,  by  proving  that  the  constraint  is  always 
satisfied,  it  must  enforce  the  constraint  dynamically,  by  checking  that  the  constraint  is  satisfied  each 
time  the  construct  is  executed. 

There  are  efficient  static  techniques  for  enforcing  most  statically  enforceable  constraints.  This 
chapter  contrasts  dynamic  and  static  techniques  for  enforcing  execution  constraints  and  surveys  previ¬ 
ous  work  on  proving  the  absence  of  execution  errors. 

3.1  THE  ALTERNATIVE  OF  RUN-TIME  CHECKING 

Run-time  checking  and  proving  the  absence  of  execution  errors  are  complementary  enforcement 
techniques  with  competing  advantages.  For  a  given  construct  (and  a  given  implementation  strategy) 
either  approach  can  be  quite  costly.  [Fischer  and  LeBlanc  80]  and  [Wortman  79]  provide  very  nice 
examples,  for  Pascal  and  Euclid  respectively,  of  how  run-time  checking  for  large  classes  of  execution 
errors  can  be  imolemented  efficiently. 

Although  most  language  implementations  have  included  heuristics  for  proving  the  absence  of  some 
execution  errors,  enforcement  in  the  past  has  been  based  on  run-time  checking.  Indeed,  much  of  the 
work  that  has  been  done  on  proving  the  absence  of  execution  errors  has  been  done  in  the  name  of 
optimizing  run-time  checking. 

The  choice  between  static  and  dynamic  checking  cannot  be  based  solely  on  the  relative  implemen¬ 
tation  costs.  An  implementation  should  also  be  as  informative  as  possible  in  reporting  when  an  error 
prevents  the  program  from  executing  properly.  If  no  errors  are  reported,  the  user  should  be  able  to 
assume  that  none  occurred.  The  language  implementation  thus  has  an  important  part  to  play  in  helping 
the  user  separate  legality  and  correctness  concerns. 

At  execution  time  it  is  harder  to  reconstruct  the  source  of  an  error.  It  can  often  be  done,  but  usu¬ 
ally  only  at  great  cost.  The  program  fault  that  caused  an  execution  error  can  be  arbitrarily  far  removed, 
both  in  the  program  text  and  in  the  program's  execution  history.  The  user  must  then  debug  his  pro¬ 
gram  with  respect  to  clean  execution  properties,  as  he  would  for  correctness  properties,  using  what  he 
knows  of  the  program’s  aborted  execution  history. 

However  time-honored  this  process  is,  an  implementation  can  be  much  more  informative  in 
reporting  an  error  if  it  can  detect  the  error  statically.  Not  only  can  it  note  where  the  error  would  have 
occurred,  but  it  can  also  try  to  trace  the  error  back  to  its  source  in  the  program  text.  In  doing  so.  it 
may  be  able  to  combine  information  about  several  possible  execution  errors  or  to  trace  the  source  of 
the  potential  error  back  to  the  start  of  the  enclosing  routine,  thus  discovering  a  precondition  on  the 
routine’s  legal  invocation. 

However  they  are  enforced,  clean  execution  conditions  must  be  derived.  Thus,  this  thesis  is  appli¬ 
cable  to  both  enforcement  approaches. 

3.2  OTHER  WORK 

In  the  rest  of  this  chapter  we  survey  other  work  on  proving  clean  execution.  We  shall  not  discuss 
work  restricted  to  an  individual  clean  execution  problem,  such  as  Suzuki  and  Ishihata’s  system  [77]  for 
checking  that  array  subscripts  are  in  range. 
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^Sites’s  Work 

Sites  [74a,  74b]  identified  proving  the  absence  of  execution  errors  as  a  well  defined  subgoal  in  rea¬ 
soning  that  a  program  works  correctly.  His  aim  was  to  prove  both  clean  execution  and  termination, 
which  he  called  clean  termination. 

Sites  dealt  with  programs  as  flow  graphs  labelled  with  program  assertions  (clean  termination  condi¬ 
tions).  By  inspection,  he  converted  a  program  in  very  simple  Algol-like  language  to  a  flow  graph  and 
labelled  the  flow  graph  with  clean  termination  conditions.  He  then  used  a  flow  graph  algorithm  to  ver¬ 
ify  the  clean  termination  conditions  produced. 

The  flow  graph  algorithm  visited  the  nodes  of  the  graph  in  a  fixed  order,  manipulating  the  labels  of 
the  graph  as  it  went.  After  all  nodes  were  visited,  each  clean  termination  condition  was  either  true  (the 
program  would  always  terminate  cleanly),  false  (the  program  would  not  always  terminate  cleanly),  or 
unresolved  (the  excluded  middle  —  the  theorem  prover  had  given  up). 

Loops  were  the  only  construct  to  introduce  the  possibility  of  non-termination  and  were  thus  the 
central  concern  of  his  algorithm.  The  basic  form  of  the  assertion  generated  for  a  loop  is  "there  exists  a 
k  such  that  on  the  kth  iteration  of  the  loop  one  of  the  exit  arcs  will  be  taken”.  (His  assertion  language 
was  not  completely  formal.)  He  developed  heuristics  for  finding  monotonically  decreasing  sequences 
within  loops  and  for  detecting  other  patterns  of  loop  use  for  which  termination  could  be  established. 
He  also  developed  techniques  for  untangling  loops  and  eliding  assertions,  extending  existing  interval 
analysis.  Although  the  larger  goals  of  Sites’s  work  were  very  close  to  those  of  this  research.  Sites  dealt 
exclusively  with  verifying  clean  termination,  deriving  clean  termination  conditions  purely  by  inspection. 

Cousot  and  Cousot's  Work 

Cousot  and  Cousot  [76]  used  an  abstract  interpretation  of  programs  to  analyze  range  constraints  on 
integer  variables.  Their  abstract  interpretation,  based  on  lattice  theory,  represented  concrete  values  as 
subranges  of  possible  values,  interpreting  the  basic  operations  of  the  language  accordingly.  This 
approach  was  extended  by  Cousot  and  Cousot  [78]  to  recursive  programs  and  by  Cousot  and  Haibwachs 
[78]  to  a  wider  class  of  linear  constraints  among  program  variables,  including  overflow,  subrange,  and 
array  bound  checking.  As  was  Sites,  Cousot  and  Cousot  were  concerned  only  with  verification. 

The  Programming  Language  Euclid 

Sites  viewed  clean  termination  as  a  distinguishable  part  of  correctness.  The  designers  of  the  pro¬ 
gramming  language  Euclid  identified  clean  execution  as  a  property  of  a  language  specification  and  not  of 
a  program  specification. 

The  Euclid  Report  [Lampson  et  al.  77]  confronted  the  problems  of  clean  execution  directly,  intro¬ 
ducing  the  notion  of  legality: 

Throughout  this  report,  various  restrictions  are  placed  on  legal  Euclid  programs.  Many  of  these 
restrictions  cannot  be  checked  syntactically,  and  in  some  cases  they  involve  dynamic  conditions 
that  are  difficult  (or  impossible)  to  check  statically.  Nevertheless,  programs  that  violate  them  are 
not  considered  to  be  meaningful  Euclid  programs.  It  is  the  responsibility  of  the  compiler  to  ver¬ 
ify  as  many  of  these  properties  as  it  can,  and  to  produce  Boolean  expressions  called  legaliry  asser¬ 
tions  for  those  it  cannot.  Thus,  any  program  whose  legality  assertions  can  all  be  verified  is  a  legal 
Euclid  program,  with  well-defined  semantics. 

Legality  assertions  are  a  form  of  clean  execution  condition.  The  Euclid  designers  provided  for 
enforcing  legality  assertions,  but  made  the  enforcement  optional  with  the  checked  option.  If  specified, 
the  checked  option  causes  legality  assertions  to  be  compiled  into  run-time  checks.  [Wortman  79] 
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.discusses  how  legality  assertion  checking  could  be  done  in  a  single  pass  Euclid  compiler.  [Patkau  78] 
presents  more  detailed  algorithms,  still  for  one-pass  compilation. 

There  are.  however,  inconsistencies  in  the  decisions  made  by  the  Euclid  designers  about  what  a 
legal  program  is.  Most  notably,  referencing  an  uninitialized  variable  and  referencing  an  object  via  an 
invalid  pointer  were  not  forbidden.  The  Euclid  designers  argued  that  "verification  generally  places 
stronger  constraints  on  variables  (pointers)  than  that  they  merely  have  valid  values  when  they  are  used 
-  they  must  have  suitable  values"  iPopek  et  al.  77).  This  argument  fails  to  address  the  complications  of 
uninitialized  variables  for  run-time  checking.  Moreover,  the  argument  can  be  applied  to  all  execution 
errors:  if  a  program  is  correct,  it  must  be  legal.  This  all  or  nothing  approach  provides  little  information 
if  the  proof  of  correctness  fails. 

More  importantly,  the  description  of  legality  assertions  was  informal.  (And.  for  example,  it  is  only 
by  omission  in  the  Euclid  Report  that  one  can  conclude  that  referencing  an  uninitalized  variable  is 
legal.)  This  thesis  can  be  viewed  as  an  attempt  to  generalize  and  formalize  the  notion  of  legality  asser¬ 
tions  and  their  defining  instances  in  the  Euclid  Report. 

The  Rune  heck  Verifier 

The  only  verification  system  for  proving  the  absence  of  a  wide  class  of  execution  errors  is  the  Run- 
check  Verifier  [German  78).  The  Runcheck  Verifier  extends  the  Stanford  Pascal  Verifier’s  semantics  of 
Pascal  [Luckham  et  al.  79]  to  guard  against  referencing  an  uninitialized  variable,  using  an  array  sub¬ 
script  that  is  out  of  range,  subrange  type  errors,  referencing  a  value  via  a  nil  pointer,  arithmetic 
overflow,  and  dividing  by  zero. 

The  Runcheck  Verifier  has  been  used  to  verify  the  absence  of  these  run-time  errors  (except 
overflow  in  some  cases)  for  a  number  of  programs,  including  a  variety  of  sorting  and  searching  pro¬ 
grams,  a  spanning  tree  algorithm,  the  Schorr-Waite  list  marking  algorithm  [Knuth  68],  and  a  program 
to  solve  the  eight  queens  problem  [Wirth  71].  The  system  works  fully  automatically  on  programs  anno¬ 
tated  with  pre/postconditions  and  invariants,  automatically  applying  several  methods  to  construct  or 
strengthen  these  assertions.  These  include  automatically  documenting  the  program  with  assertions 
about  numeric  values  and  data  structures,  analyzing  the  program  for  loops  that  initialize  arrays,  and 
strengthening  invariants  based  on  unproved  verification  conditions. 

German’s  work  is  encouraging  evidence  that  clean  execution  conditions  once  derived  can  be 
verified.  [German  78]  does  not,  however,  include  a  precise  definition  of  the  clean  execution  conditions 
to  be  verified.  Moreover,  the  derivation  of  clean  execution  conditions  is  tied  to  a  particular  language; 
no  help  is  given  for  extending  the  approach  to. other  languages. 

Eggert’s  yVork 

t 

Eggert  [80]  proposes  to  eliminate  execution  errors  by  strengthening  the  type  system  so  that  what 
would  be  execution  errors  become  statically  checkable  type  errors.  The  key  idea  is  to  introduce  sub- 
types  that  identify  legal  uses  of  a  type,  using  Algol  68-like  conformity  case  statements  to  ensure  that 
values  are  of  the  proper  subtypes  for  their  uses.  Thus,  to  ensure  in  Pascal  that  a  null  pointer  is  not 
used  to  identify  a  dynamic  variable  of  type  T,  the  subtype  of  non-null  pointers  to  T  would  be  intro¬ 
duced. 

There  is  a  strong  methodological  argument  to  be  made  for  this  approach:  it  forces  the  programmer 
to  document  with  the  types  he  uses  why  his  program  is  free  of  execution  errors.  This  argument,  how- 
ever,  also  points  up  a  basic  flaw  in  the  approach:  the  programmer  must  do  all  the  work.  Using  confor¬ 
mity  case  statements  ensures  only  that  the  language  interpreter  can  be  simple-minded.  This  approach  is 
thus  a  step  backwards  along  what  the  Euclid  designers  [77]  saw  as  "one  of  the  main  lines  of  current 
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•programming  language  development:  tranferring  more  and  more  of  the  work  of  producing  a  correct  pro¬ 
gram,  and  verifying  its  correctness,  from  the  programmer  and  the  verifier  (human  and  mechanical)  to 
the  language  and  its  compiler". 

The  problems  are  well  illustrated  for  arithmetic.  If  what  would  normally  be  an  integer  subexpres¬ 
sion  might  result  in  overflow,  it  must  yield  instead  a  value  of  a  type  such  as 

overinteger  =  record  case  overflow:  boolean  of 
false:  (data:  integer); 
true:  0 
end 

The  program  must  then  be  made  more  complex  to  guard  against  an  arithmetic  operation  being  applied 
to  a  value  of  type  "overinteger",  and  more  complex  yet  to  guard  against  dividing  by  zero.  Thus,  the 
programmer  must  embed  the  logic  to  describe  exceptional  cases  within  the  program  itself,  rather  than 
being  able  to  consider  exceptional  cases  separately. 

This  static  approach  is  also  inherently  limited:  all  errors  must  be  expressed  as  type  errors.  It  does 
not  address  important  classes  of  execution  errors  such  as  aliasing  and  referencing  uninitialized  variables. 

Coleman  and  Hughes’s  l^ork 

Coleman  and  Hughes  [79]  addressed  the  problem  of  proving  program  properties  given  the  finite 
bounds  of  real  computers.  They  proposed  including  bounds  constraints  with  machine-dependent 
parameters  in  the  axiomatic  description  of  Pascal  [Hoare  and  Wirth  73],  thus  allowing  properties  to  be 
proved  for  any  machine  that  met  the  specified  constraints.  Their  derivation  of  clean  execution  condi¬ 
tions  for  integer  constraints  is  informal,  and  where  thorough,  is  a  somewhat  cumbersome  enumeration 
of  cases.  This  enumeration  of  cases  is  quite  naturally  expressed  in  a  formalism  tied  to  the  underlying 
grammar.  We  shall  introduce  such  a  formalism  in  the  next  chapter. 
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4.  A  REWRITE  RULE  APPROACH  TO  DEFINING  CLEAN  EXECUTION 

This  chapter  introduces  a  method  based  on  rewrite  rules  for  deriving  a  clean  execution  condition 
from  a  piece  of  program  text.  In  the  sections  that  follow,  we  define  the  assertion  language  to  be  used, 
review  the  basic  concepts  of  rewrite  rules,  illustrate  the  method  using  a  simple  arithmetic  expression 
language,  and  discuss  how  the  meaning  of  clean  execution  conditions  depends  on  how  they  are 
enforced. 

4.1  AN  ASSERTION  LANGUAGE 

In  order  that  new  verification  techniques  not  be  required,  we  shall  insist  that  clean  execution  con¬ 
ditions  be  expressed  in  the  same  terms  as  other  program  properties.  Intuitively,  the  assertion  language 
consists  of  the  usual  Boolean  operators,  universal  quantification  over  the  vaiues  of  the  programming 
language  in  question,  and  simple  propositions  such  as 

TRUE 
3*a-rb  <  n 

color  IN  (red,  green,  blue) 

More  precisely,  the  structure  of  the  assertion  language  is: 

assertion  =*  term 
I  assertion 

I  ’('  assertion  binBoolOpr  assertion  ')’ 

I  ’(’  V  id  ’)’  assertion 
term  »  TRUE  |  FALSE 
I  expr  compOpr  expr 
I  expr  IN  type 
I IDEQ  ’(’  id  id  ’)’ 
binBoolOpr  - 
compOpr  -  ’-’I '5^’ I 

"expr",  "type",  and  "id"  correspond  to  values,  types,  and  identifiers  of  the  programming  language. 

For  clarity,  different  fonts  will  be  used  to  distinguish  different  kinds  of  identifiers  and  keywords. 
Program  identifiers  and  rewrite  rule  variables  will  appear  in  lower  case  roman  (with  the  first  letters  of 
words  within  identifiers  capitalized),  as  in  "n"and  "arithExpr".  Programming  language  keywords  will 
appear  in  lower  case  italics,  as  in  var  and  end.  Assertion  language  keywords  will  appear  in  upper  case 
boldface,  as  in  TRUE  and  IN.  Rewrite  function  identifiers  will  appear  in  boldface,  mostly  lower  case, 
as  in  clean  and  IndexType. 

As  is  customary  in  program  proving  methods,  we  shall  use  expressions  and  types  in  the  program¬ 
ming  language  to  denote  values  and  types  in  the  assertion  language.  To  be  precise,  basic  operators  in 
the  assertion  language  such  as  IN  and  take  as  operands  values  and  types  of  the  assertion  language. 

If  we  enforce  clean  execution  conditions  with  run-time  checks  expressed  in  the  programming 
language  itself,  no  more  expressive  power  is  required  for  the  values  and  types  of  the  assertion  language 
than  the  values  and  types  of  the  programming  language.  If,  on  the  other  hand,  we  verify  clean  execu¬ 
tion  conditions,  the  assertion  language  may  require  other  values  of  the  theories  involved.  In  particular, 
reasoning  about  numbers  may  require  that  the  assertion  language  include  the  integers  (or  reals)  and  not 
just  the  finite  subrange  included  in  the  programming  language. 
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For  simplicity  the  grammar  requires  binary  Boolean  operations  to  be  fully  parenthesized. 
Parentheses  may  be  omitted  where  unambiguous,  assuming  the  usual  precedence  of  Boolean  operators. 

IDEQ,  which  is  the  only  somewhat  unusual  feature,  checks  whether  two  identifiers  are  the  same, 
(id  “  idl  is  a  comparison  of  values.)  IDEQdd,  idl)  will  usually  be  a  static  property. 

4.2  REWRITE  RULES  AS  A  FORMALISM 

We  shall  define  clean  execution  conditions  for  language  constructs  using  rewrite  rules  derived  from 
the  underlying  structure  of  the  programming  language.  The  definition  is  recursive:  the  clean  execution 
of  a  construct  is  specified  in  terms  of  the  clean  execution  of  component  constructs  and  propositions  in 
the  underlying  assertion  language.  Thus,  when  applied  to  a  given  program  construct,  the  rewrite  rules 
produce  the  corresponding  clean  execution  condition. 

Rewrite  rules  have  been  used  in  a  variety  of  contexts.  The  algebraic  data  type  specifications  of 
Guttag  [Guttag  75,  77;  Guttag  et  al.  78]  are  a  particularly  elegant  application.  [Musser  80]  provides  a 
very  nice  introduction  to  the  theory  of  rewrite  rules  and  several  applications.  [Huet  and  Oppen  80]  is  a 
comprehensive  survey  of  the  different  semantics  and  proof  theories  given  for  rewrite  rules. 

The  use  of  rewrite  rules  here  is  quite  stylized.  We  shall  not  attempt  to  extend  the  theory  x)f 
rewrite  rules,  but  rather  to  restrict  the  way  rewrite  rules  are  used  to  provide  an  effective  way  of  deriving 
clean  execution  conditions. 

The  purpose  of  the  rewrite  rules  is  to  translate  a  term  of  the  form  clean  (program)  to  one  in  the 
assertion  language.  Before  we  can  discuss  the  particulars  of  that  translation,  we  must  define  the  formal¬ 
ism  to  be  used.  In  the  rest  of  this  section  we  define  the  basic  concepts  of  rewrite  rules  and  sketch  how 
we  shall  use  them.  The  development  of  basic  concepts  is  taken  from  [Musser  80]. 

We  first  define  expressions,  variables,  substitutions  of  expressions  for  variables,  and  rewrite  rules. 
These  basic  concepts  will  be  used,  often  implicitly,  in  defining  clean  execution  conditions.  We  then 
define  reducibility  and  two  well-formedness  properties:  unique  termination  and  finite  termination.  The 
effect  of  applying  rewrite  rules  is  defined  via  reducibility;  unique  and  finite  termination  are  properties  of 
sets  of  rewrite  rules  needed  to  show  that  a  set  of  rewrite  rules  produces  the  desired  result. 

Expressions 

Given  a  set  of  symbols  called  names^  expressions  are  defined  recursively:  an  expression  is  either  (1) 
a  name  v  or  (2)  a  name  f  and  a  sequence  of  n^O  expressions  ep...,e  .  v  is  called  a  variable.  (We  shall 
use  the  term  variable  to  apply  both  to  program  variables  and  to  variables  in  rewrite  rules.  Both  uses  are 
well  established  within  their  respective  domains,  and  no  ambiguity  should  result.)  f  is  called  the 
(main)  operator  of  this  expression,  and  e|,...,ej^  are  called  the  arguments  of  f;  such  an  expression  is  usu¬ 
ally  written  f(ej,...,ej^).  An  expression  with  no  arguments  is  written  fO,  distinguishing  it  from  vari¬ 
ables.  Expressions  are  essentially  finite  ordered  trees,  and  these  denotations  should  be  regarded  as 
merely  a  linear  notation  for  trees. 

We  assume  it  possible  to  test  names  for  equality.  Two  expressions  ej  and  ^2  identical,  written 
ej  =*=  ^2'  ^^®y  t)Oth  the  same  variable  or  they  both  have  the  same  main  operator  and 

identical  arguments  in  the  same  order. 

The  subexpressions  of  an  expression  are  (1)  the  entire  expression  and  (2)  the  subexpressions  of  the 
arguments  (if  any)  of  the  expression.  Thus,  variables  and  constants  have  no  subexpressions  other  than 
themselves. 
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The  variable  set  of  an  expression  e  is  (e)  if  e  is  a  variable,  and  otherwise  is  the  union  of  the  vari¬ 
able  sets  of  the  arguments  of  e.  Thus,  the  variable  set  of  a  constant  is  empty. 

Substitutions 

A  substitution  of  expressions  for  variables  is  a  mapping  cr  such  that 

cr(v)  —  =«  V  for  all  but  a  finite  number  of  variables  v  and, 
cr(f(ej,...,ej^))  *  l(cr(e j),...,crvej^)) 

We  denote  the  substitution  cr  such  that 

cr{v.)  ="=“61  for  i“l,...,n,  and 
cr(v)  =»»  V  otherwise 

by  o-  -  (cj /or  Vj,...,ej^/or  v^]. 

An  expression  ej  has  the  form  of  an  expression  e^  if  there  is  a  substitution  cr  such  that 
o-(e2)  Cj.  For  example, 

iffhenElseCswitch,  p(max,  y),  :*(max,  x)) 

has  the  form  of 

ifThenElse(expr,  stmtSeql,  stmtSeq2) 

by  the  substitution  a  =  [switch  for  expr,  p(max,  y)  for  stmtSeql,  :-(max,  x)  for  stmtSeq2l.  Note  that 
"has  the  form  oP  is  not  a  symmetric  relation. 

Rewrite  Rules 

A  rewrite  rule  is  an  ordered  pair  of  expressions,  written  L  R,  such  that  the  variable  set  of  R  is 
contained  in  the  variable  set  of  L. 

Let  RR  be  a  finite  set  of  rewrite  rules.  An  expression  ej  is  reducible  with  respect  to  RR  if  there  is 
a  rule  L  R  in  RR  and  a  subexpression  e’  of  ej  such  that  e’  has  the  form  of  L.  (For  most  of  what 
follows,  we  shall  assume  a  given  RR.)  Letting  cr  be  a  substitution  such  that  cr(L)  =  =«  e',  we  say  ej 
reduces  directly  to  e2,  where  e2  is  the  expression  obtained  by  replacing  one  occurrence  of  e’  by  cr(R)  in 
ej.  We  write  this  as  ej  e2.  If  Cj  e2  is  false  for  all  expressions  e2,  then  ej  is  irreducible. 

Let  be  the  relation  on  pairs  of  expressions  that  is  the  reflexive,  transitive  closure  of 

e  e’  if  and  only  if  there  are  expressions  ep,  e|,...,e^  where  n^O  such  that  e  =■  =  Cq,  ej  ej_^j  for 

i  —  0,...,n-l  and  e„  *  *  e’.  We  read  e  e^  as  e  reduces  to  e’. 

n 

Let  "  — be  the  relation  such  that  e^  *  >  ^2  if  and  only  if  ej  ->*  e2  and  e2  is  irreducible.  We 

read  ^2  ®1  to  ^2'  ^  set  of  rewrite  rules  RR  has  the  unique  termination  property  (also 

called  the  Church-Rosser  property)  if  and  only  if  the  corresponding  relation  "=>"  is  a  function,  that  is, 
whenever  “>  e2  and  e.  “>  e^  we  have  ^2  ~  ^  ®3’  function  the  meaning  function 

for  RR,  and  denote  it  by  /xlRR]  or  simply  p..  Thus,  if  e^  =*>  ^2'  write  p.[RR]iQ^)  =  e2  or  pt(ej) 
*  ^2-  Note  that  if  ej  e2,  then  ^(ej)  =■  =  p.iQ2^- 


-18- 


A  set  of  rewrite  rules  RR  has  the  finite  termination  property  if  and  only  if  there  is  no  infinite 
sequence  ®o  ^  If  a  set  of  rewrite  rules  RR  has  both  the  unique  and  finite 

termination  properties,  the  meaning  function  *  /x  is  a  total  function.  We  call  such  a  set  of 

rewrite  rules  convergent  [Musser  80].  In  Chapter  8  we  shall  define  what  it  means  for  a  definition  of 
clean  execution  to  be  proper,  including  that  the  set  of  rewrite  rules  be  convergent. 

Our  Application  of  Rewrite  Rules 

In  characterizing  execution  errors  in  Chapter  2,  we  viewed  programs  as  expressions.  We  now  make 
this  view  precise.  Programming  language  identifiers  and  constants  are  names.  Programming  language 
keywords  and  special  symbols  are  main  operators.  A  program  and  its  constituent  parts  are  thus  expres¬ 
sions.  We  shall,  however,  retain  programming  language  syntax.  (We  shall  ignore  two  purely  syntactic 
difficulties  with  this  view:  the  lack  of  an  explicit  operator  for  some  operations,  such  as  the  referencing 
operation,  and  there  being  more  than  one  keyword  in  some  operations,  such  as  if-then-else  state¬ 
ments.)  One  effect  of  this  is  to  choose  a  particular  abstract  syntax  for  program  segments. 

Assertion  language  keywords  and  operators,  as  well  as  the  rewrite  functions  to  be  introduced 
below,  are  also  main  operators.  In  the  rewrite  rules  below,  programming  language  symbols  will  be  main 
operators  only  of  proper  subexpressions.  Thus,  the  rewrite  functions  introduced  below  will  drive  the 
rewrite  rule  process. 

4.3  A  SIMPLE  ARITHMETIC  EXPRESSION  LANGUAGE 

An  example  will  illustrate  the  basic  approach.  Consider  the  simple  arithmetic  expression  language 
whose  syntax  is 

arithExpr  =  term 

I  arithExpr  arithOpr  term 
arithOpr  =»  ’-H’ | I’*’ I  7’ 
term  “  constant 

I ’(’ arithExpr ’)’ 

1  arithExpr 

There  are  no  variables  or  operator  precedence  in  this  language,  and  expressions  are  evaluated  left  to 
right  subject  to  parentheses. 

Let  us  assume  that  as  the  language  designers  of  this  little  language  we  decide  that  the  only  legal 
integer  values  an  arithmetic  expression  can  denote  are  those  of  an  unspecified  type  "Integer". 

For  a  given  expression  e  in  the  language,  we  would  like  to  know  when  e  can  be  executed  cleanly. 
As  we  shall  see,  this  can  be  expressed  as  a  function  of  the  program  text  of  e.  Let  us  call  this  function 
cleanE.  cleanE  accepts  the  text  of  an  "arithExpr"  and  returns  an  assertion: 

cieanE:  ArithExpr  —  >  Assertion 


In  order  to  express  the  assertion  cleanEC  e)  in  terms  of  assertions  about  subexpressions  of  e,  we 
shall  define  cleanE  recursively  based  on  the  grammar.  For  each  kind  of  arithmetic  expression,  we  shall 
give  a  rewrite  rule.  For  example,  the  rewrite  rule 


-19- 


cleanE('-’  arithExpr) 

cleanE(arithExpr)  &  arithExpr  IN  Integer 

defines  a  sufficient  condition  for  the  clean  execution  of  any  arithmetic  expression  of  the  form 
arithExpr.  Intuitively,  the  subexpression  "arithExpr"  must  execute  cleanly  and  the  result  must  in 
"Integer". 

The  identifier  arithExpr"  is  an  unbound  variable  in  the  above  rule.  Here,  as  in  subsequent  rules, 
an  unbound  identifier  is  implicitly  universally  quantified  over  all  values  of  the  corresponding  grammati¬ 
cal  type. 

The  complete  definition  of  cleanE  is  as  follows: 
cieanE;  ArithExpr  — >  Assertion 

cleanE(constant)  constant  IN  Integer 
cleanE(’(’  arithExpr  ’)’)  cieanE(arithExpr) 

cleanE(’-’  arithExpr) 

cleanE(arithExpr)  &  arithExpr  IN  Integer 

syntax:  pmt  -  ’-h’ | | 

cIeanE(arithExpr  pmt  term) 

cleanE(arithExpr)  &  cleanE(term) 

&  arithExpr  pmt  term  IN  Integer 

cIeanE(arithExpr  V’  term) 

cleanEfarithExpr)  &  cleanE(term) 

&  term  ^  0 

&  arithExpr  7’  term  IN  Integer 


Notation:  It  will  occasionally  be  convenient,  as  in  this  example,  to  define  additional  syntax  in 
order  to  group  alternatives  with  similar  clean  execution  rules.  This  can  easily  be  eliminated  by 
writing  a  rule  for  each  alternative.  A  non-terminal  so  defined  is  local  to  that  assertion  function. 
We  shall  emphasize  the  local  nature  of  the  non-terminal  by  constructing  its  name  rather  arbi¬ 
trarily  from  the  first  letters  of  each  alternative,  as  in  p(lus)m(inus)t(imes). 

With  these  rules  we  can  mechanically  derive  a  clean  execution  condition  for  a  given  arithmetic 
expression.  For  example,  the  clean  execution  condition  for  (77/2)*666  can  be  derived  as  follows: 

cleanE((77/2)*666) 

-> 

cleanE((77/2))  &  cIeanE(666)  &  (77/2)*666  IN  Integer 

-> 

cleanE(77/2)  &  666  IN  Integer  &  (77/2)*666  IN  Integer 

-> 

cIeanE(77)  &  cleanE(2)  &  25*^0  &  77/2  IN  Integer 
&  666  IN  Integer  &  (77/2)'666  IN  Integer 

-> 

77  IN  Integer  &  2  IN  Integer  &  2^0  &  77/2  IN  Integer 
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&  666  IN  Integer  &  (77/2)*666  IN  Integer 
This  assertion  must  then  be  verified  for  the  type  "Integer". 

4.4  WHAT  WE  HAVE  DONE  AND  THE  SHAPE  OF  THINGS  TO  COME 

Note  what  we  have  done  in  defining  cleanE.  We  have  provided  a  clear,  formal  definition  of  what 
clean  execution  means  in  our  arithmetic  expression  language  and,  in  doing  so,  have  provided  an  algo¬ 
rithm  for  generating  a  clean  execution  condition  for  a  given  program  contruct. 

The  basic  idea  in  specifying  clean  execution  is  simple:  an  operation  can  be  executed  cleanly  if  its 
operands  are  legal  and  if  the  operands  can  be  combined  legally  by  the  operator  involved.  In  the 
definition  of  cleanE  above,  the  recursive  invocations  of  cleanE  checked  that  the  operands  were  legal, 
and  the  propositions  involving  values  checked  that  they  were  combined  legally. 

As  a  result  of  our  view  of  programming  language  as  expression  language,  a  grammar  for  the 
language  is  quite  naturally  central  to  this  approach:  it  is  the  basis  for  the  recursive  definition  of  assertion 
functions  such  as  cieanE.  Essentially,  each  construct  in  the  language  corresponds  to  a  non-terminal. 
There  is  an  assertion  function  for  each  non-terminal  that  specifies  the  legality  of  the  corresponding  con¬ 
struct.  Each  assertion  function  is  defined  by  a  set  of  rewrite  rules,  where  each  rewrite  rule  expresses 
the  clean  execution  of  some  case  of  the  non-terminal  in  question  as  an  assertion  involving  assertion 
function  invocations  for  component  constructs. 

For  pure  LISP,  our  job  would  be  done.  For  other  than  purely  applicative  languages,  however, 
there  are  several  ways  in  which  the  simple  view  of  programming  language  as  expression  language  fails 
us.  As  with  any  purely  syntactic  method,  the  view  does  not 

(1)  treat  constructs  that  have  side-effects, 

(2)  associate  declared  information  with  its  uses,  and 

(3)  address  potentially  infinite  structures. 

We  shall  have  to  augment  the  straight  rewrite  rule  method  to  handle  these  problems. 

4.5  ON  ENFORCING  CLEAN  EXECUTION  CONDITIONS 

After  this  glimpse  at  the  rewrite  rule  formalism,  we  are  in  a  position  to  look  at  how  the  enforce¬ 
ment  method  influences  both  the  way  assertions  are  expressed  and  the  meaning  of  the  logical  connec¬ 
tives  in  our  assertion  language. 

Run-Time  Checking 

Let  us  first  consider  run-time  checking.  Although  a  clean  execution  condition  defines  what  condi¬ 
tions  must  be  checked  to  ensure  clean  execution,  an  implementation  may  check  a  clean  exe-  cution 
condition  in  a  variety  (and  even  a  mixture)  of  ways.  For  example,  an  implementation  may  be  able  to 
take  advantage  of  hardware  overflow  or  subscript  range  checking. 

How  a  particular  check  is  implemented  need  not  concern  us;  the  order  in  which  checks  are  made 
does.  For  example,  only  if  2p^0  does  it  make  sense  operationally  to  check  77/2  IN  Integer. 

More  generally,  legally  applying  an  operator  typically  depends  on  the  legality  of  its  operands.  For 
run-time  checking  the  assertions  produced  by  the  rewrite  rules  must  obey  such  ordering  constraints. 
We  must  insist  that  each  rewrite  rule  put  conjuncts  checking  that  the  operands  are  legal  before  con- 
juncts  checking  that  the  operands  are  legally  combined. 
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Also,  once  a  conjunct  is  found  to  be  false,  subsequent  conjuncts  need  not  be  legal,  as  we  saw 
above.  Thus,  for  run-time  checking  we  shall  consider  the  "and"  operation  to  be  conditional  [McCarthy 
63].  Although  not  all  logical  operations  need  be  conditional,  such  a  high  percentage  do  that  in 
evaluating  a  clean  execution  condition  as  a  run-time  check  we  shall  for  simplicity  assume  they  all  are. 
An  implementation  is  free  to  reorder  terms  as  long  as  the  result  can  be  legally  evaluated. 

For  arithmetic  operations  (and  comparisons)  we  must  further  ensure  that  no  run-time  check  is 
compromised  by  overflow  occurring  during  the  run-time  check  itself.  Thus,  the  rewrite  rules  for  arith¬ 
metic  operations  must  both  reflect  size  constraints  of  the  language  implementation  and  be  expressed 
quite  carefully.  For  example,  let  us  assume  that  "Integer"  is  the  contiguous  subrange  L..U  and  that 
operations  whose  operands  and  result  are  within  "Integer"  are  legal.  If  we  need  not  worry  about 
overflow,  the  addition  rule,  for  example,  can  be  expressed  simply  as 

cleanE(arithExpr  ’  +  ’  term) 

cleanE(arithExpr)  &  cleanE(term) 

&  L  ^  arithExpr  ’  +  ’  term  ^  U 

However,  if  overflow  can  occur,  such  rules  require  a  careful  enumeration  of  cases.  Let  us  assume  for 
convenience  that  the  implementation  uses  the  same  subrange  "Integer".  Then  the  addition  rule 
becomes 

cleanE(arithExpr  ’  +  ’  term) 

cleanE(arithExpr)  &  cieanE(term) 

&  ((arithExpr<0)  ^  (term<0)  /*  mixed  signs  */ 

1  0  ^  term  &.  arithExpr  ^  U  -  term  /*  both  -bve  */ 

1  term  <  0  &  L  -  term  ^  arithExpr)  /*  both  -ve  */ 

[Wortman  79]  explores  how  the  possibility  of  overflow  in  run-time  checks  complicates  the  statement  of 
clean  execution  conditions. 

Verifying  Clean  Execution  Conditions 

If  clean  execution  conditions  are  instead  to  be  verified,  we  have  none  of  these  problems  (assum¬ 
ing,  as  we  did  earlier,  that  the  assertion  language  uses  God’s  integers  and  not  IBM’s).  A  clean  execu¬ 
tion  condition  is  purely  an  algebraic  entity  that  is  "evaluated"  by  applying  axioms  of  the  underlying 
theories.  Thus,  b?^0  &  a/b  IN  Integer  is  the  same  as  a/b  IN  Integer  &  b  -^0,  and  it  makes  no  sense 
to  talk  of  conditional  evaluation. 

At  first  glance  it  might  seem  that  when  clean  execution  conditions  are  to  be  verified,  the  assertions 
generated  are  unnecessarily  long.  Clearly,  if  a/b  IN  Integer,  then  it  must  also  be  true  that 
cleanE(a)  &  cleanE(b)  &  b;*^0.  If  b=-0,  however,  a/b  IN  Integer  cannot  be  reduced  to  either  TRUE 
or  FALSE. 

A  program  either  executes  cleanly  or  it  does  not.  We  thus  would  like  the  rewrite  rules  to  have  the 
property  that  a  clean  execution  condition  can  always  be  reduced  to  either  TRUE  or  FALSE.  Keeping 
the  more  verbose  rules  for  verification  ensures  that  this  can  be  done. 
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5.  THE  CLEAN  EXECUTION  OF  AN  EXAMPLE  LANGUAGE 

In  this  and  the  next  two  chapters  we  develop  the  method  for  defining  clean  execution  that  is  at  the 
heart  of  this  thesis.  We  proceed  largely  by  example,  using  rewrite  rules  to  specify  clean  execution  con¬ 
dition  rules  for  the  constructs  of  a  block-structured  language.  The  example  language  is  purely  illustra¬ 
tive:  language  design  decisions  are  sometimes  made  (and  re-made)  purely  to  explore  the  corresponding 
clean  execution  rules. 


We  shall  define  the  clean  execution  of  our  example  language  bottom-up:  first  expressions,  then 
statements,  and  then  declarations.  This  chapter  deals  with  those  constructs  whose  clean  execution  can 
be  defined  purely  with  rewrite  rules.  This  does  not  include  two  sets  of  constructs: 

(Da  composition  operator  to  compose  programs  from  declarations  and  statements,  taking 
scoping  into  account,  and 

(2)  parameterized  declarations:  procedure  and  function  declarations,  loops,  and  parameterized 
types. 

Chapters  6  and  treat  7  these  two  sets  of  constructs,  respectively. 


5.1  ARITHMETIC  ANi 


H  iA  1 


Let  us  return  to  the  arithmetic  expression  grammar  above  and  define  different  clean  execution 
rules.  Rather  than  restricting  all  subexpressions  to  be  in  "Integer",  we  shall  only  constrain  those  we 
intend  to  operate  on.  That  is,  we  can  write  large  constants;  we  just  can’t  add  or  subtract  them.  cleanE 
now  is  defined  as: 


cleanE:  ArithExpr  — >  Assertion 
cleanE(constant)  TRUE 

cleanE('(’  arithExpr  ’)’)  cleanE(arithExpr) 

cleanE(’-’  arithExpr) 

legalOperand(arithExpr)  &  arithExpr  IN  Integer 

syntax:  pmt  =*  ’-i-’  |  | 

cleanE(arithExpr  pmt  term) 

legalOperand(arithExpr)  &  legalOperand(term) 

&  arithExpr  pmt  term  IN  Integer 

cleanE(arithExpr  ’/’  term) 

legaiOperand(arithExpr)  &  legalOperand(term) 

&  term  0 

&  arithExpr  ’/’  term  IN  Integer 

legalOperand:  ArithExpr  —  >  Assertion 

legalOperand(arithExpr) 

cleanE(arithExpr)  &  arithExpr  IN  Integer 

Although  this  arithmetic  expression  language  has  the  same  grammar  as  the  earlier  one.  the  language  is 
different.  Rewrite  rules  provide  a  precise  way  of  describing  the  differences. 
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No  case  analysis  need  be  done  for  legalOperand.  We  could  instead  have  substituted  the  definition 
of  legalOperand  for  every  instance  of  its  use  in  the  definition  of  cleanE,  thus  eliminating  legalO- 
perand.  The  arguments  for  and  against  creating  a  separate  function  are  the  same  as  in  other  types  of 
programming.  Similarly,  a  "legalResult"  function  might  have  been  defined. 

Another  language  construct  where  rewrite  rules  provide  a  natural,  precise  way  of  specifying 
language  design  decisions  is  Boolean  expressions.  The  language  designer  could  specify  that  both 
operands  of  a  binary  operator  must  always  be  evaluable  by; 

cleanE(expr  and  t\px\) 

cleanE(expr)  &  cleanE(exprl) 
cleanECexpr  or  exprl) 

cleanE(expr)  &  cleanE(exprl) 

or  that  the  second  operand  is  to  be  evaluated  only  if  need  be  (as  is  the  case  for  the  conditional  opera¬ 
tors  of  the  assertion  language): 

cleanE(expr  exprl) 

cleanE(expr)  &  (-■  expr  1  cleanE(exprl)) 
cleanE(expr  or  exprl) 

“*■  cleanE(expr)  &  (expr  |  cieanE(exprl)) 


Notation'.  By  convention,  an  unbound  identifier  will  always  be  the  same  identifier,  with  possibly  a 
single  decimal  digit  added,  as  the  corresponding  non-terminal.  This  convention  spares  us  declar¬ 
ing  and  explicitly  quantifying  such  identifiers.  A  single  decimal  digit  can  be  appended  to  an 
identifier  to  produce  new  identifiers  of  the  same  syntactic  type.  This  is  required,  as  in  the  above 
rule,  to  distinguish  between  different  instances  of  a  non-terminal  in  an  ambiguous  rule. 

5.2  SIMPLE  VARIABLE  REFERENCES 

Let  us  add  simple  variables  of  type  "Integer"  to  the  arithmetic  expression  language  just  defined; 

term  =  ...  1  varRef 
varRef  =  varld 

To  ban  references  to  uninitialized  variables,  we  shall  specify 
cleanE(varRen  varRef  IN  Integer 

That  is,  if  we  can  verify  varRef  IN  Integer,  we  can  conclude  that  "varRef  has  been  initialized. 

This  rule  is  limited  to  "Integer"  variables.  To  construct  a  more  general  rule,  we  need  a  way  of 
associating  the  corresponding  type  with  a  variable  reference.  We  shall  denote  the  type  of  a  variable 
reference  as  ItsType(varReD.  We  can  then  write: 

cleanE(varRef)  varRef  IN  ItsType(varRef) 

or,  to  identify  the  property  being  asserted; 
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cleanE(varReD  initialized(varRef) 

initialized:  VarRef  —  >  Assertion 

initialized(varRef)  varRef  IN  ItsType(varRef) 


We  still  must  define  ItsType( varRef),  which  is  an  attribute  of  "varReP  that  must  be  derived  from 
the  associated  declaration. 

5.3  STRUCTURED  VARIABLE  REFERENCES 

The  notation  ItsType( varRef)  also  allows  a  natural  statement  of  a  clean  execution  rule  for  array 
references.  Let  us  add  singly  dimensioned  arrays  subscripted  by  arithmetic  expressions  to  the  language: 

varRef  =■  varld 

I  varld  ’[’  arithExpr 

We  shall  assume  that  we  can  always  check  statically  that  scalar  and  array  identifiers  are  properly  used. 
(Recall  that  we  earlier  assumed  that  statically  enforceable  constraints  have  already  been  enforced.)  To 
include  array  references,  we  must  replace  the  above  clean  execution  condition  for  variable  references 
with 


cleanE(varRef) 

validRef(varRef)  &  iiiitialized(varRef) 

validRef:  VarRef  — >  Assertion 

validRef(varld)  TRUE 

validRef( varld  ’[’  arithExpr  ’]’) 
cleanE(arithExpr) 

&  arithExpr  IN  IndexType(ItsType(varId)) 

IndexType(ItsType( varld)),  like  ItsType(varReO,  must  be  derived  from  the  associated  declaration. 

The  clean  execution  rule  for  variable  references  is  quite  intuitive:  a  variable  reference  can  be 
legally  evaluated  if  it  properly  references  a  variable  and  if  that  variable  has  been  initialized.  Whatever 
their  intuitive  meanings,  however,  it  is  important  to  remember  that  cleanE,  validRef,  and  initialized 
are  just  functions  of  program  text  whose  meaning  is  defined  by  the  rewrite  rules  given. 

Since  references  to  fixed  record  components  can  be  checked  statically,  adding  such  references  to 
this  definition  is  easy: 

varRef  *  ...  |  varld  fieldid 

No  changes  to  cleanE  or  initialized  are  required;  to  validRef  we  must  add: 


validRefC varld  fieldid) 


TRUE 
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Thus  far,  array  elements  have  been  scalars;  we  could  instead  allow  a  record  or  array  component  to 
be  an  arbitrary  variable: 

varRef  —  varld 

1  varRef  ’['  arithExpr 
I  varRef  fieldid 

The  definition  of  validRef  is  correspondingly  recursive: 

•  validRef:  VarRef  — >  Assertion 

vaiidRef(varld)  TRUE 

vaiidRef(varRef  arithExpr  ’]’) 

“*  validRef(varRef)  &  cleanE(arithExpr) 

&  arithExpr  IN  IndexTyp€(ItsTyp€( varRef)) 

vaiidRef(varRef  fieldid)  validRef(varRef) 


5.4  DERIVED  AND  DECLARED  ATTRIBUTES 

ItsType(varRef)  and  IndexType(ItsType(varRef))  are  attributes  of  "varReC  and  its  type  respec¬ 
tively.  Attributes,  which  denote  information  from  the  named  declarations,  play  a  central  role  in 
defining  clean  execution. 

Attributes  of  types,  variables,  and  constants,  called  standard  components,  play  a  prominent  role  in 
the  programming  language  Euclid.  Euclid’s  standard  components  include  declared  properties,  such  as 
the  type  of  a  variable  or  constant,  the  base  type  of  a  set  type,  and  the  first  element  in  an  enumerated  or 
subrange  type,  as  well  as  execution-time  properties,  such  as  the  current  value  of  the  tag  of  a  variant 
record  and  the  successor  of  a  value  in  an  enumerated  or  subrange  type. 

Standard  components  allow  algorithms  based  on  the  attributes  of  objects  to  be  expressed  directly  in 
terms  of  these  attributes.  Otherwise,  the  programmer  must  maintain  this  information  explicitly,  which 
requires  additional  variables  or  parameters  and  assignments  or  function  invocations  and  thus  often 
makes  the  program  harder  to  understand  and  maintain. 

Attributes  such  as  ItsType  and  IndexType  offer  similar  advantages.  However,  the  programmer  is 
provided  standard  components  by  the  language;  the  language  designer  must  provide  attributes  himself. 

We  shall  distinguish  two  kinds  of  attributes  depending  on  how  they  are  provided.  An  attribute 
defined  as  a  rewrite  function  is  derived.  An  attribute  defined  by  a  substitution  function  based  on  the 
identified  declaration  is  declared. 

IndexType,  for  example,  is  a  derived  attribute.  After  ItsType(varRef)  has  been  replaced  by  the 
type  of  "varRef  and  any  type  names  have  been  replaced  with  their  corresponding  definitions,  Index- 
Type  can  be  defined  as  a  rewrite  function: 

IndexType:  StructuredType  — >  Type 

IndexType(arra>’  indexType  o/componentType) 
indexType 
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(The  similarity  between  IndexType  and  "indexType"  is  a  mnemonic  coincidence;  one  is  a  rewrite  func- 
‘tion,  and  the  other  a  language  construct.) 

ItsType(varRef)  applies  to  components  of  structured  variables  as  well  as  entire  variables.  To  sim¬ 
plify  the  substitution  function,  we  shall  define  ItsType(varRef)  in  terms  of  the  type  of  a  declared  vari¬ 
able  TYPE<varId>  only: 

ItsType:  VarRef  —  >  Type 

ItsType(varld)  TYPE<varId>  . 

ItsTypefvarRef  ’[’  expr  ’]’) 

ComponentType(ItsType(varRef)) 

ItsType(varRef  fieidid) 

RecordElementType(fieldId,  ItsType(varRef)) 

ItsType(varRef)  is  thus  a  derived  attribute.  TY?E<varId>  is  a  declared  attribute,  and  there  must  be 
a  corresponding  substitution  function  to  replace  it  with  the  type  of  "varld",  as  we  shall  discuss  later. 
The  derived  attributes  ComponentType  and  RecordElementType  are  defined  similarly  to  IndexType: 

ComponentType:  StructuredType  —  >  Type 

ComponentType( array'  indexType  o/componentType) 

ComponentType 

RecordElementType:  Fieldid  x  StructuredType  — >  Type 

RecordElementType(fieldld,  recori/ fieldid  1  typel 

fieldld2  type2}  end) 

—  IF  IDEQ(fieldId,  fieldidl) 

THEN  typel 

ELSE  RecordElementType(fieldId,  record 
fieldld2  type2}  end)  FI 

checked  statically,  "fieldid"  is  a  valid  component  of  the  specified  record. 


Notation:  We  shall  use  a  generic  IF  expression: 

IF  assertion  THEN  fribble  [ELSE  frijablel]  FI 

whose  meaning  should  be  clear.  Rewrite  rules  are  strongly  typed,  the  types  of  the  arguments  of  a 
rewrite  function  can  only  be  of  the  proper  grammatical  types.  Thus,  to  be  precise,  a  separate 
name  and  definition  are  required  for  each  type  of  "fribble"  used  as  the  second  and  third  argu¬ 
ments  to  the  IF  function,  such  as  for  "type"  above. 

In  the  special  case  when  the  second  and  third  arguments  are  assertions,  the  IF  expression 

IF  assertion  THEN  assertionl  ELSE  assertion2  FI 


could  be  expressed  instead  as 


(assertion  &  assertionl)  1  (-  assertion  &  assertion2) 


Notatiorr.  As  in  RecordElementType,  assumptions  about  what  has  been  checked  statically  will  be 
noted  where  important.  These  assumptions  are  only  comments  and  are  not  part  of  our  formal¬ 
ism. 

We  still  must  replace  declared  attributes  (and  declared  types)  with  their  corresponding  definitions. 
Moreover,  since  interactions  of  scoping  with  other  language  features  can  result  in  execution  errors,  we 
need  a  rigorous  way  of  doing  so.  How  this  is  done  is  the  principal  subject  of  the  next  chapter. 

5.5  STATEMENTS 

Since  cleanE  applies  to  expressions  and  not  to  statements,  we  need  an  assertion  function  for 
"stmt"s: 

cleanSt;  Stmt  — >  Assertion 


Notation:  As  mentioned  above,  most  rewrite  functions  express  the  clean  execution  of  a  construct 
in  terms  of  the  clean  execution  of  component  constructs.  Thus,  we  shall  need  a  family  of 
rewrite  functions  each  of  which  yields  the  clean  execution  condition  of  a  whatever: 

clean W;  Whatever  -->  Assertion 

where  W  is  an  acronym  for  a  whatever.  (The  "W"  is  usually  not  pronounced.)  We  could  use  a 
generic  function  clean,  as  was  done  with  the  IF  function  above.  We  shall  often  need  to  refer  to 
a  particular  instance  of  clean,  however,  and  this  naming  convention  provides  a  concise  way  of 
doing  so. 

We  have  already  developed  two  of  the  three  pieces  needed  to  define  the  clean  execution  of  the 
assignment  statement: 

cleanSt(varRef  expr) 

validRef(varRef)  &  cleanE(expr) 

&  asgmtTC(expr,  ItsTyp€(varRef)) 

We  shall  define  asgmtTC  (assignment  type  compatibility)  when  we  give  clean  execution  rules  for  func¬ 
tion  and  procedure  type  compatibility  below. 

A  clean  execution  rule  for  if-then-else  statements  is  straightforward.  Let  us  assume  that  the 
expression  used  can  be  checked  statically  to  be  a  Boolean  value.  Consider  the  rule  for  Pascal  s  condi¬ 
tional: 


cleanSt(//expr  then  stmt  [else  stmtll) 
“*■  cleanE(expr) 

&  (expr  &  cleanSt(stmt) 

I”'  expr  [Si.  cleanSt(stmtl )]) 


Notatiorr.  As  in  this  rule,  the  grammatical  convention  of  enclosing  an  optional  element  in  brack¬ 
ets  has  a  natural  analog  in  our  rewrite  rules.  Corresponding  to  an  optional  element  on  the  left- 
hand  side  of  a  rule  may  be  a  clause  in  the  assertion  on  the  right-hand  side  of  the  rule.  This 
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analogous  convention  adds  no  power  to  the  rewrite  rules  because  the  grammatical  convention 
adds  no  power  to  the  grammatical  formalism:  the  grammar  can  mechanically  be  rewritten  to 
eliminate  bracketed  terms  [Thompson  77].  After  translating  the  grammar,  there  would  be  no 
need  for  the  rewrite  rule  convention. 

Another  grammatical  convention  that  has  a  natural  rewrite  rule  analog  is  the  use  of  braces  to 
indicate  zero  or  more  of  the  enclosed  element.  Corresponding  to  each  of  the  arbitrary  number 
of  elements  on  the  left-hand  side  of  a  rule  is  zero  or  more  parts  of  an  assertion  on  the  right-hand 
side  of  the  rule.  Each  such  part  is  enclosed  in  braces.  Usually  only  one  part  is  needed;  occasion¬ 
ally,  a  second  is  needed,  typically  to  balance  parentheses,  as  in  the  next  rule.  Again,  no  power  is 
added  to  the  rewrite  rules. 

If  we  add  elseif  clauses  (as  in  Algol  68,  Euclid,  or  Modula),  then  the  order  of  evaluation  drives  the 
creation  of  the  rule: 

cleanStC// expr  then  stmt 

[elseif  exprl  then  stmtl } 

[else  stmt2]) 
cleanE(expr) 

&  (expr  &  cleanSt(stmt) 

I  -•  expr  {&  cieanE(expri) 

&  (exprl  &  cleanSt(stmtl) 

1 -•  exprl)  [&  cleanSt(stmt2)]  {)}  ) 

The  rule  for  case  statements  would  follow  along  similar  lines. 

Begin  blocks  provide  a  means  of  delimiting  statement  sequences: 
cleanSt(^g/>7  stmt  end)  cleanSt(stmt) 

or,  alternatively,  of  delimiting  scopes: 

cleiLnSiibegin  sco^Q  end)  — ^  cleanSc(scope) 

We  shall  treat  statement  sequences  as  a  special  case  of  statements  in  the  next  chapter. 

Because  of  its  recursive  nature,  the  clean  execution  rule  for  loops  is  based  on  that  for  procedure 
declarations  and  is  developed  later. 

TYPE  INCOMPATIBILITY 

Type  incompatibility  is  an  important  source  of  execution  errors.  Assignment  compatibility  can  be 
straightforward  to  formulate,  as  for  example: 

asgmtTC:  Expr  x  Type  — >  Assertion 
asgmtTC(expr,  type)  expr  IN  type 


Let  us  consider  the  execution  problems  of  procedure  invocations.  Procedures  typically  can  have 
both  value  and  variable  parameters: 
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procDcln  *  proc  procld  [’(’  procParmList  ’)’] 
routineBody 

procParmList  *  [var]  parmid  type 

[var]  parmid  type} 

For  most  languages,  however,  there  is  no  information  at  the  point  of  call  as  to  whether  a  variable  refer¬ 
ence  corresponds  to  a  value  or  variable  parameter.  (This  is  a  problem  for  any  formalism;  as  program¬ 
mers  we  can  just  look  at  the  parameter  list  in  the  procedure  declaration.)  Care  must  be  taken  in  using 
the  same  "check  the  parts,  check  how  they  are  combined"  paradigm.  Both  checks  (checking  that  the 
arguments  are  legal  and  checking  that  the  arguments  are  type  compatible  with  their  corresponding  for¬ 
mal  parameters)  depend  on  the  nature  of  the  parameters.  We  shall  let  validParms  sort  this  out: 

cieanSt(procId  (’(’  exprList  ’)’]) 

—  TRUE 

[&  validParras(exprList,  PARMS<  procId>)] 

PARMS<  procld>  is  a  declared  attribute:  the  parameter  list  associated  with  the  procedure  "procid". 

validParms  checks  that  arguments  corresponding  to  value  parameters  are  valid  values,  arguments 
corresponding  to  variable  parameters  are  valid  references,  and  all  arguments  are  type  compatible. 
yalidParms  deals  with  the  parameter  list  expressed  recursively: 

parmList  *  [var]  parmid  type 

I  [var]  parmid  type  parmList 

After  PARMS<procld>  has  been  replaced,  validParms  is  straightforward: 
validParms:  ExprList  x  ParmList  — >  Assertion 

validParms(expr,  parmid  type) 

cleanE(expr)  &  valueParmTC(expr,  type) 

validParms(expr,  var  parmid  type) 

validRef(expr)  &  varParmTC(expr,  type) 

vaIidParms(expr  exprList, 

parmid  type  procParmList) 
cleanE(expr)  &  valueParmTC(expr,  type) 

&  validParms(exprList,  procParmList) 

validParms(expr  exprList, 

var  parmid  type  procParmList) 
validRef(expr)  &  varParmTC(expr,  type) 

&  validParms(exprList,  procParmList) 

In  the  second  and  fourth  rewrite  rules,  the  "expr"  is,  more  specifically,  a  variable  reference. 

There  is  an  implicit  assumption  in  the  definition  of  validParms,  that  the  expression  and  type  lists 
are  of  the  same  length.  For  our  purposes  here,  we  shall  assume  that  there  is  a  context-dependent  con¬ 
straint  to  this  effect.  If  not,  validParms  would  have  to  be  made  more  complex  to  handle  this. 
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We  still  must  define  value  and  variable  type  compatibility.  We  could  decree,  for  example,  that  type 
compatibility  for  value  parameters  means  that  the  argument  is  assignable  to  the  formal: 

valueParmTC:  Expr  x  Type  — >  Assertion 

valueParmTC(expr,  type) 
asgmtTC(expr,  type) 

and  that  type  compatibility  for  variable  parameters  means  that  the  types  of  formal  and  arguments  match 
exactly: 

yarParmTC:  VarRef  x  Type  — >  Assertion 

varParmTCCvarRef,  type) 

TypeEquiv(ItsType(varRef),  type) 

Type  equivalence  would  then  have  to  be  defined. 

In  addition  to  their  role  in  defining  clean  execution  conditions,  assertion  functions  such  as  var- 
PannTC  and  TypeEquiv  serve  a  larger  purpose;  they  formalize  very  important  programming  language 
concepts  that  heretofore  have  been  defined  only  informally. 

Let  us  now  consider  the  type  compatibility  problems  of  function  invocations,  and  let  us  restrict  our 
attention  to  pure  functions,  ones  that  do  not  modify  their  parameters  or  their  environments.  We  shall 
confront  the  problems  of  side-effects  for  statement  composition.  The  same  techniques  could  be  applied 
here,  but  it  will  simplify  the  exposition  (not  to  mention  the  proof  rules  needed!)  if  we  delay  the  con¬ 
frontation. 

Since  function  parameters  are  a  restricted  form  of  procedure  parameter: 

term  =*  ...  |  fcnid  (’(’  exprList  ’)’] 
fcnDcln  *  fen  fcnid  [’(’  fenParmList  ’)’] 

remrns\2j\d  type  ’=*’  routineBody 
fenParmList  —  parmid  type  {\’  parmid  type} 

the  function  invocation  rule  follows  from  its  procedure  counterpart: 

cleanE(fcnId  [’(’  exprList  ’)]) 

— ^  TRUE  [&  validPanns(exprList,  PARMS<fcnId>)] 

PARMS<fcnId>  is  the  parameter  list  associated  with  the  function  "fcnid". 

VARIABLE,  CONSTANT,  AND  TYPE  DECLARATIONS 

From  the  point  of  view  of  proving  program  properties,  declarations  are  assertions.  A  type  declara¬ 
tion,  for  example,  asserts  that  objects  of  the  type  will  take  on  only  certain  values  and  be  manipulated 
only  by  certain  operators.  A  constant  declaration  asserts  that  the  constant  will  obey  the  rules  of  the 
type  and  is  assigned  a  value  only  once.  Having  ensured  that  declarations  are  maintained,  an  implemen¬ 
tation  can  use  the  asserted  properties  in  establishing  other  properties. 

Declarations  are  also  executable  actions.  Objects,  types,  and  routines  may  variously  require  storage 
to  be  allocated,  initializations  done,  and  names  associated. 
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The  same  check  the  parts,  check  how  they  are  combined"  paradigm  works  for  declarations  as  for 
expressions  and  statements.  The  problems  of  bounds  in  type  definitions  being  properly  defined,  initial 
values  of  constants  and  variables  being  type  compatible,  return  parameters  being  assigned  a  proper 
value,  and  function  and  procedure  bodies  being  cleanly  executable  fit  within  this  paradigm. 

Adding  declarations,  we  can  talk  about  scopes  and  programs: 

program  =*  scope 
scope  —  {dcln  stmt 
dcln  =■  varDcln  |  consDcln  |  typeDcln 
I  fcnDcln  |  procDcln 
varDcln  =»  vorvarld  type 
consDcln  *  cons  consid  type  expr 
typeDcln  =■  lype  typeld  ’=*’  type 


For  languages  where  constants  in  declarations  are  statically  defined  (known  at  compile-time),  the 
clean  execution  rules  for  variable,  constant,  and  type  declarations  are  uninteresting: 

cleanD(varDcln)  TRUE 
cleanD(consDcln)  TRUE 

cleanD(  typeDcln)  TRUE 

For  languages  where  constants  are  not  necessarily  bound  until  scope  entry,  the  clean  execution  of  vari¬ 
able  and  type  declarations  reduces  to  that  of  types: 

cleanDCvor  varld type)  cleanT(lype) 

cleanDfiype  typeld type)  cleanT(type) 

The  clean  execution  of  constant  declarations  also  requires  assignment  compatibility: 

cIeanD(coA;s  consid  type  expr) 

— ^  cleanT(type)  &  cleanE(expr) 

&  asgmtTCCexpr,  type) 


The  same  rules  would  apply  if  the  form  of  declarations  allowed  a  list  of  identifiers  to  be  declared. 
For  ease  of  explanation,  we  shall  restrict  ourselves  throughout  this  thesis  to  declarations  of  a  single 
identifier.  This  restriction  applies  also  to  identifiers  declared  as  formal  parameters.  Declaring  more 
than  one  identifier  with  the  same  attributes  requires  successive  declarations.  The  rules  for  declarations 
generalize  in  a  straightforward,  if  occasionally  cumbersome,  way  to  lists  of  identifiers. 

We  still  must  define  cleanT.  The  most  common  source  of  execution  errors  in  type  declarations  is 
in  expressions  used  in  bounds.  Consider  the  following  types: 

type  *  simpleType  |  structuredType 
simpleType  =  ’Integer’  |  ’Boolean’ 

I  ’(’  enumeratedld  {’,’  enumeraledld)  ’)’ 

I  cons  ’..’  cons 
I  namedType 

StructuredType  *  array  simpleType  q/  type 
I  record  ’:’  type 

{’,’  fieldid  ’:’  type)  end 
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I  namedType 
namedType  “  typeld 


Built-in  and  enumerated  types  present  no  execution  problems: 

cleanK ’Integer’)  — ^  TRUE 

cleanT(’Boo!ean’)  TRUE 

cleanT(’(’  enumeratedid  {’,’  enumeratedidl}  ’)’) 

^  TRUE 

but  not  so  subrange  types: 

cleanT(cons  consl) 

cleanE(cons)  &  cleanE(consl) 

There  could  also  be  an  ordering  constraint  on  the  bounds  of  subrange  types,  such  as: 

cieanT(cons  consl) 

cleanE(cons)  &  cleanE(consl) 

&  cons  <  consl 


In  our  example  language  structured  types  present  only  the  problems  of  component  types: 

cleanT(arrfl>' simpleType  o/type) 

cleanT(simpleType)  &  cleanT(type) 
cleaiiT(recorcy  fieldid  ’:’  type 

{’,’  fieldidl  typel}  end) 
cieanT(type)  {&  cleaiiT(typel)} 


Although  the  definition  of  a  named  type  may  result  in  an  execution  error,  references  to  the  type 
cannot: 

cleanT(namedType)  TRUE 

In  Chapter  7  we  shall  discuss  parameterized  type  definitions,  references  to  which  do  present  execution 
problems. 

.  SUMMARY 

In  this  chapter  we  have  defined  the  clean  execution  of  much  of  a  Pascal-like  language  using  rewrite 
rules  organized  as  assertion  functions.  The  definition  is  recursive,  based  on  the  grammar.  Each  asser¬ 
tion  function  corresponds  to  some  non-terminal  of  the  programming  language,  translating  a  program¬ 
ming  language  expression  of  the  appropriate  syntactic  type  to  an  assertion  language  expression.  Key  to 
specifying  these  assertion  functions  are  declared  attributes  of  variables,  constants,  types,  and  routines, 
such  as  the  type  of  a  variable  identifier.  Although  we  have  yet  to  provide  these  declared  attributes, 
they  illustrate  the  advantages  of  properly  chosen  intermediate  abstractions  in  decomposing  a 
specification. 
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To  complete  our  artificial  language,  we  need  two  sets  of  constructs: 

(1)  a  composition  operator  to  compose  programs  from  declarations  and  statements,  taking 
scoping  into  account,  and 

(2)  parameterized  declarations:  procedure  and  function  declarations,  loops,  and  parameterized 
types. 

The  next  two  chapters  deal  with  these  two  sets  of  constructs,  respectively. 
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6.  CLEAN  EXECUTION  OF  STATEMENT  SEQUENCES  AND  SCOPES 

6.1  COMPOSING  STATEMENTS  WITH  STATEMENTS 

Until  now  we  have  considered  the  clean  execution  of  a  construct  quite  apart  from  its  meaning. 
With  statement  sequences 

stmtList  =■  stmt  stmtList] 

this  is  not  possible,  whether  a  statement  can  execute  cleanly  is  dependent  on  the  transformation 
effected  by  the  previous  statement(s).  For  example,  in 

X  ;  2; 

y  (a  +  b)/x 

the  first  statement  ensures  that  there  is  no  division  by  zero  in  the  second  statement. 

Although  we  cannot  describe  the  clean  execution  of  sequential  statements  within  our  rewrite  rule 
formalism,  we  can  do  so  using  a  formalism  for  defining  the  meaning  of  programs,  such  as  Dijkstra’s 
weakest  precondition  [Dijkstra  76]: 

cleanSt(stmt  stmtList) 

deanSt(stmt)  &  wp<stmt,  cleanSt(stmtList)  > 

wp<stmt,  cleanSt(stmtList)  >  is  the  clean  execution  condition  for  "stmtList"  before  "stmt"  is  to  be 
executed. 

Notatioir.  We  shall  use  angle  brackets  to  enclose  the  arguments  of  operators  that  are  not  rewrite 
functions.  This  distinguishes  such  operators  and  makes  reading  clean  execution  conditions 
easier. 

Especially  since  we  cannot  completely  separate  clean  execution  from  meaning  for  statement 
sequences,  we  might  want  to  consider  the  clean  execution  of  statements  not  as  a  separate  property,  but 
as  a  part  of  total  correctness.  We  could  then  talk  in  terms  of  clean  termination  instead  of  termination. 
The  extension  of  Dijkstra’s  weakest  precondition,  for  example,  to  a  weakest  clean  precondition  would 
be  straightforward: 

wcp<stmt,  assertion > 

“  cleanSt(stmt)  &  wp<stmt,  assertion > 

As  we  saw  in  the  previous  chapter  and  as  this  formula  indicates,  the  clean  execution  condition  of  a 
statement  is  a  precondition  that  depends  only  on  the  form  of  the  statement  and  not  on  the  context  in 
which  the  statement  is  used. 

It  is  important,  however,  that  this  research  not  be  tied  to  any  particular  way  of  defining  the  mean¬ 
ing  of  programs.  First,  and  most  importantly,  what  constitutes  a  legal  program  is  conceptually  separate 
from  what  a  legal  program  means.  Second,  including  clean  execution  in  the  same  formalism  as  the 
meaning  of  programs  would  greatly  increase  the  number  of  trivial  assertions  to  be  proved  in  verifying 
correctness  properties.  Moreover,  if  two  correctness  properties  were  verified  separately,  the  verification 
of  clean  execution  would  have  to  be  duplicated.  We  shall  assume  only  that  some  definition  of  the 
meaning  of  programs  is  available.  We  need  some  method  for  illustration,  and  most  often  use  predicate 
transformers. 
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. No!  All  Free  Identifiers  are  Free 

Most  axiom  systems  for  proving  correctness  properties  rely  on  a  pun:  a  variable  identifier  also 
denotes  its  value.  The  assignment  axiom,  for  example,  substitutes  an  expression  for  free  occurrences 
of  a  variable  name.  In  evaluating  the  truth  of  an  assertion,  on  the  other  hand,  a  variable  identifier 
denotes  its  value. 

This  dual  role  of  identifiers  causes  no  problems  in  proving  correctness  properties  because  correct¬ 
ness  properties  are  expressed  only  in  terms  of  values.  In  deriving  a  clean  execution  condition,  how¬ 
ever,  we  cannot  always  safely  exploit  this  pun.  For  example,  TYPE<x>  and  validRef(x)  depend  on 
the  structure  of  "x"  and  not  its  value.  In  initialized(x),  "x"  is  a  name;  after  rewriting  it  as  x  IN 
ItsType(x),  "x"  first  denotes  a  value  and  then  is  a  name.  (An  identifier  denotes  a  value  when  it 
appears  by  itself;  it  is  a  name  when  used  as  an  argument  to  a  rewrite  rule  function  or  to  a  declared  attri¬ 
bute.) 

No  complications  arise  in  the  pure  rewrite  rule  part  of  our  formalism.  It  is  only  when  we  introduce 
a  mechanism  for  defining  the  meaning  of  programs,  as  in  the  statement  sequence  rule,  that  problems 
arise.  If  the  transformation  specified  is  defined  by  a  substitution  for  a  variable  identifier,  that  substitu¬ 
tion  should  apply  only  to  occurrences  of  the  identifier  that  denote  values. 

The  solution  is  to  distinguish  between  names  and  values.  In  a  strict  sense,  however,  we  cannot 
distinguish  between  names  and  identifiers  using  rewrite  rules  because  we  cannot  construct  new  names. 
A  free  identifier  on  the  left-hand  side  of  a  rewrite  rule  either  appears  on  the  right-hand  side,  in  which 
case  it  is  still  free,  or  does  not,  in  which  case  it  can  never  be  recreated.  Musser  [78]  uses  extended 
names  to  distinguish  different  uses  of  names  (operators)  in  DTVS.  German  [78]  italicizes  identifiers 
used  as  names,  taking  care  to  replace  these  uses  before  invoking  the  verifier. 

To  distinguish  between  names  and  values,  we  shall  (1)  constrain  the  order  of  evaluation  so  that 
weakest  preconditions  are  not  evaluated  until  there  are  no  rewrite  rules  that  apply  to  the  subject  asser¬ 
tion,  and  (2)  substitute  only  for  identifiers  that  are  not  arguments  of  rewrite  functions  or  declared  attri¬ 
butes.  As  we  shall  see  in  the  example  at  the  end  of  the  chapter,  these  constraints  are  quite  natural. 

6.2  COMPOSING  DECLARATIONS  WITH  SCOPES 

When  we  defined  the  clean  execution  of  declarations,  we  dealt  with  them  as  executable  actions, 
ignoring  their  principal  role  as  assertions.  When  we  defined  the  clean  execution  of  expressions  and 
statements,  however,  we  relied  on  asserted  properties  of  declarations  in  the  form  of  derived  and 
declared  attributes. 

We  need  a  rigorous  way  of  associating  declared  information  with  its  uses  so  that  we  can  check  that 
declarations  are  satisfied.  Especially  since  scoping  interacts  with  other  language  features,  this  must  take 
scoping  into  account. 

To  do  this,  we  shall  introduce  a  mechanism  outside  the  rewrite  rules,  substitution  functions.  A  sub¬ 
stitution  function  replaces  each  occurrence  within  an  assertion  of  a  specified  name  with  the  correspond¬ 
ing  declared  information.  The  name  is  either  a  single  identifier  or  a  declared  attribute.  There  is  a  sub¬ 
stitution  function  for  each  declared  attribute  of  each  declaration.  (Substitution  functions  can  only  be 
applied  to  irreducible  clean  execution  conditions.) 

Substitution  functions  could  also  be  defined  with  rewrite  rules  if  we  allowed  assertions  in  the 
domains  of  rewrite  functions.  However,  this  would  require  a  great  many  rewrite  rules  to  effect  the  sub¬ 
stitutions  and  would  greatly  complicate  the  proof  of  convergence  of  the  rewrite  rules. 
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Notaiiorr.  All  substitution  functions  have  signatures  of  the  form: 

WSub:  Whatever  x  Whatever  x  Assertion  — >  Assertion 

As  in  naming  cieanW  functions,  we  shall  name  substitution  functions  by  choosing  W  to  be  an 
acronym  for  a  "whatever". 

Scoping  of  Identifiers 

By  the  decision  that  a  substitution  functions  makes  about  where  identifiers  are  known,  it  imple¬ 
ments  a  scoping  policy.  In  our  example  language  an  identifier  and  its  definition  will  be  known 
throughout  the  remainder  of  the  enclosing  block.  This  is  a  typical  block  structured  view  with  the  addi¬ 
tional  requirement  of  definition  before  use  often  imposed  to  allow  one  pass  compilation.  Although 
extra  mechanism  would  be  needed  to  allow  mutually  recursively  defined  types  and  routines,  this  view  is 
sufficient  to  illustrate  the  issues  of  scoping. 

Because  the  rewrite  rules  are  based  on  the  grammar,  it  is  important  that  the  grammar  we  use 
reflect  the  view  of  scoping: 

scope  =  dcln  scope  1  stmt 


In  order  to  give  a  rule  for  the  composition  of  a  declaration  with  a  scope,  it  is  now  easier  to  work 
with  an  assertion  function  cleanSc  that  subsumes  the  assertion  function  for  declarations,  cleanD.  The 
cleanD  rules  will  serve  as  the  basis  for  cleanSc.  For  example,  translating  the  variable  declaration  rule, 
we  have: 

cleanSc(  var  varld  type  scope) 
cleanT(type)  &  cleanSc(scope) 

After  the  declarations  have  been  peeled  off,  we  also  need: 

cleanSc(stmt)  cleanSt(stmt) 

(We  could  instead  have  included  cleanSt  as  part  of  cleanSc.) 

Substitution  Functions  for  Variable  Declarations 

This  variable  declaration  rule,  of  course,  does  not  include  the  effect  of  the  declaration  on  its  scope. 
In  particular,  a  variable  declaration  restricts  the  variable  to  be  of  the  specified  type.  This  restriction  is 
enforced,  for  example,  by  the  assignment  rule  given  earlier: 

cleanSt(varRef ’:  =  ’  expr) 

“*■  validRef(varRef)  &  cleanE(expr) 

8t  asgmtTCCexpr,  ItsType(varRef)) 

After  ItsType{varRef)  has  been  rewritten  in  terms  of  TYPE<varld>,  TYPE<varld>  must  be 
replaced  by  the  associated  type  throughout  its  scope.  TSub  performs  that  replacement 

cleanSc(  var  varld  type  scope) 
cleanT(type) 

&  TSub<type,  TYPE<varId>,  cleanSc(scope)  > 


-37- 


.Except  for  the  composite  nature  of  the  identifier  TYPE<varid>,  this  substitution  could  have  been 
denoted  in  a  classical  program  proving  logic  as 

(TYPE<varId> 

cleanSc{scope)i 

[type 


For  this  view  of  scoping,  the  type  substitution  function  is  a  straight  textual  substitution,  subject  to 
the  constraint  that  substitution  function  invocations  are  properly  nested:  inner  substitution  functions 
must  be  applied  first.  The  other  substitution  functions  we  shall  need  for  our  example  language  will  also 
be  textual  substitutions. 

There  are  two  aspects  of  scoping  captured  by  this  rule.  First,  the  rule  defines  where  in  the  program 
"varld"  can  be  known.  By  defining  which  occurrences  of  "varld"  to  replace,  we  could  implement  more 
restrictive  scoping  policies  such  as  the  import  restrictions  of  Modula  or  Euclid.  Second,  because  the 
rule  is  recursive,  an  outer  declaration  will  be  masked  by  inner  declarations,  thus  defining  at  least  in  part 
where  "varld"  is  not  known. 

Type  Declarations 

Although  there  are  no  declared  attributes  of  declared  types,  there  is  still  a  need  for  substitution 
functions  in  defining  the  clean  execution  of  type  declarations.  Consider,  for  example,  rewrite  functions 
that  determine  whether  or  not  a  type  is  numeric  or  whether  a  type  is  simple  or  structured.  These  func¬ 
tions  formalize  properties  of  the  underlying  types,  that  is,  the  types  and  type  constructors  of  the  pro¬ 
gramming  language  and  not  type  names.  Unless  type  names  are  replaced,  additional  mechanism  must 
be  introduced  into  all  rewrite  functions  that  deal  with  underlying  types. 

The  effect  of  a  type  declaration  is  a  substitution.  We  can  capture  that  effect  using  the  same  type 
substitution  function  we  used  for  variable  type  information: 

cleanScftype  typeld  type  scope) 
cleanTftype) 

&  TSub<type,  typeld,  cleanSc(scope)  > 


Constant  Declarations 

There  are  also  no  declared  attributes  for  constant  declarations.  (Since  a  constant  is  assigned  a 
value  only  once,  we  need  its  type  only  to  check  the  declaration.) 

Just  as  we  had  to  include  the  effect  of  previous  statements  in  defining  the  clean  execution  of 
sequential  statements,  we  must  include  the  effect  of  the  assignment  defined  by  a  constant  declaration  in 
our  clean  execution  rule.  To  avoid  a  predicate  transformer,  we  shall  use  a  substitution  function  to  cap¬ 
ture  the  assignment: 

cleanSc(coA7S/ consid  type  expr  scope) 
cleanT(type)  &  cleanE(expr) 

&  asgmtTC(expr,  type) 

&  ESub<expr,  consid,  cleanSc(scope)  > 

If  records  and  arrays  can  be  initialized  with  structured  values,  asgmtTC  must  be  extended  accordingly. 
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Var table  Declarations  Revisited 

We  still  have  a  shortcoming  of  the  variable  declaration  rule  to  take  care  of.  If  we  view  an  irreduci¬ 
ble  clean  execution  condition  as  a  tree,  evaluating  it  is  a  percolation  process:  a  term  percolates  up  the 
tree  until  either  a  predicate  transformer  or  a  substitution  function  allows  the  term  to  be  simplified  to 
true  or  false  and  eliminated. 

If  a  clean  execution  condition  depends  on  the  value  of  a  variable,  that  clean  execution  condition  is 
neither  true  or  false.  It  must  percolate  up  until  it  no  longer  depends  on  the  variable  reference. 

The  special  case  where  a  clean  execution  condition  for  the  scope  of  a  variable  declaration  depends 
on  the  value  of  that  variable  is  no  different:  it  is  still  neither  true  nor  false.  But  the  clean  execution 
condition  for  the  scope  including  the  variable  declaration  is  false.  At  the  point  of  declaration  clean  exe¬ 
cution  cannot  be  guaranteed. 

For  many  proof  methods  propositions  containing  free  variables  are  left  as  unverifiable.  Taking  this 
approach,  we  would  be  unable  to  verify,  for  example,  a  clean  execution  condition  that  depends  on  the 
value  of  an  uninitialized  variable.  (Note  that  this  is  true  whether  or  not  references  to  uninitialized  vari¬ 
ables  are  legal.) 

Yet  there  are  several  reasons  why  we  might  fail  to  verify  a  proposition:  (1)  the  proposition  to  be 
verified  might  not  be  true  (heaven  forbid!),  (2)  there  might  be  a  weakness  in  the  proof  method  (unde¬ 
cidability  is  one  possible  weakness),  and  (3)  we  may  have  failed  in  our  use  of  the  proof  method  (in 
proving  correctness  properties,  for  example,  we  may  not  have  provided  proper  lemmas  in  the  form  of 
intermediate  assertions  for  the  verification  to  go  through).  Thus,  when  we  fail  to  verify  a  proposition, 
there  is  little  information  provided  as  to  why  we  failed. 

To  ensure  that  the  clean  execution  condition  of  the  scope  associated  with  a  variable  declaration 
does  not  depend  on  the  value  of  that  variable,  we  must  reduce  any  assertion  that  is  invalid  because  it 
references  that  variable  to  FALSE  by  universally  quantifying  the  assertion  over  the  variable: 

cleanSc(var  varld  type  scope) 

cleanT(type) 

&  (V  varld) 

TSub<type,  TYPE<varId>, 
cleanSc(scope)  > 

where  "varldl"  is  a  fresh  identifier. 

A  small  problem  remains  with  this  rule.  The  universal  quantification  will  "capture"  identifiers 
introduced  by  the  procedure  invocation  rule  in  the  process  of  substituting  arguments  for  formal  param¬ 
eters.  This  is  a  classical  problem  that  plagued  early  definitions  of  Algol  60,  amongst  others.  To  handle 
this  problem,  we  must  rename  variable  identifiers: 

clean Sc( var  varld  type  scope) 

cleanT(type) 

&  (V  varldl)  ESub< varldl,  varld, 

TSub<type,  TYPE<varId>, 
cleanScfscopeV  >  > 

where  "varldl"  is  a  fresh  identifier. 
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If  variable  declarations  can  optionally  include  initialization: 

varDcln  =“  vor  varld  type  [’:=■’  expr] 

the  additional  rule  is  similar  to  the  constant  declaration  rule  above: 

cleanSc(  var  varld  type  ’  expr  scope) 
cleanT(type)  &  cleanE(expr) 

&  asgmtTC(expr,  type) 

&  ESub<expr,  varld, 

TSub<type,  TYPE<varId>,  cleanSc(scope))  > 

In  the  previous  variable  declaration  rule  we  eliminated  the  free  variable  by  universally  quantifying  it; 
here  we  eliminate  the  free  variable  by  replacing  it  with  an  expression  denoting  its  value. 

6.3  AN  EXAMPLE  APPLICATION  OF  THE  REWRITE  RULES 

We  have  developed  enough  of  our  example  language  to  show  the  clean  execution  rules  in  action. 
Consider  the  following  simple  program: 

var  x:  Int; 

var  z:  array  1..6  0/I..IOC; 

X  x-97; 
z[x]  :=*  1 

This  example  will  illustrate  how  the  rewrite  rule  process  is  mechanical  and  how  applying  rewrite  rules 
interacts  with  evaluating  weakest  preconditions  and  substitution  functions. 

To  shorten  the  derivation,  we  shall  apply  rewrite  rules  to  as  many  non-overlapping  conjunctions  as 
possible  in  a  single  simplification  step.  Also,  we  shall  perform  two  obvious  simplifications  as  we  go: 

TRUE  &  P  —  P 

p  &  p  —  p 

treating  expressions  of  our  "Assertion"  type  as  first  order  predicate  calculus  expressions.  This 
simplification  is  outside  the  rewrite  rules. 


cIetnSc{  vflr  x:,Int;  var  z:  array  1..6  0/I..IOO; 
x:  — x-97;  z(x]:- 1) 

cleanT(Int)  &  (V  x)TSub<Int,  TYPE<x>, 

cleanSc(var  z:  array  1..6  0/I..IOO;  x:™x-97;  z(x):=“l)  > 

TRUE  &  (V  x)TSub<Int,  TYPE<x>, 
cleanT(arra>'  1..6  0/I..IOO) 

&  (V  z)TSub< array'  1..6  q/T..100,  TYPE<z>, 
cleanSc(x:*x-97;  z[xl:  =  1)  >> 

(V  x)TSub<Int,  TYPE<x>, 
cleanT(1..6)  &  cleanT(1..100) 

&  (V  z)TSub<arra>'  1..6  0/I..IOO,  TYPE<z>, 
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cIeanSt(x;=*x-97;  zlx]:*  1)  >> 

-> 

(V  x)TSub<Int,  TYPE<x>, 

cleanECl)  &  cleanE(6)  &  cleanE(l)  &  cleanE(lOO) 

&  (V  z)TSub<  jrrov  1..6  o/l,.100,  TYPE<z>, 
cleanSt(x;==x-97) 

&  wp(x:  =  x-97,  cleanSt(zlx]:— D)  >> 

-> 

(V  x)TSub<Int,  TYPE<x>, 

TRUE  &  TRUE  &  TRUE  &  TRUE 
&  (V  z)TSub<arrav  1..6  0/I..IOO,  TYPE<z>, 
vaiidRef(x)  &  cleanE(x-97) 

&  asgintTC(x-97,  ItsType(x)) 

&  wp<x;  =  x-97,  validRef(z[x])  &  deanE(l) 

&  asgmtTCd,  ItsType(z(x])) >  >> 

-> 

(V  x)TSub<Inl,  TY?E<x>, 

(V  z)TSuh<arrav  1..6  0/I..IOO.  TYPE<z>, 

TRUE  &  cleanE(x)  &  cieanE(97) 

&  x-97  IN  Int  &  x-97  IN  ItsType(x) 

&  wp<x:=“x-97,  validRef(z)  &  cleanE(x) 

&  X  IN  IndexType(ItsType(z)) 

&  TRUE  &  1  IN  ItsType(z[x])>  >  > 

-> 

(V  x)TSub<Int,  TYPE<x>, 

(V  z)TSub<flrra;^  1..6  0/I..IOO,  TYPE<z>, 
yalidRef(x)  &  initialized(x)  &  TRUE 
&  X.97  IN  Int  &  x-97  IN  TYPE<x> 

&  wp<x:’-x-97,  TRUE  &  yalidRef(x)  &  initialized (x) 

&  X  IN  IndexType(TYPE<z>) 

&  1  IN  ComponentType{ItsType(z))  >  >> 

-> 

(V  x)TSub<Int,  TYPE<x>, 

(V  z)TS\x\i< array  1..6  0/I..IOO,  TYPE<z>, 

TRUE  &  X  IN  ItsType(x)  &  x.97  IN  Int  &  x-97  IN  TYPE<x> 

&  wp<x:*“x-97,  TRUE  &  x  IN  ItsType{x) 

&  X  IN  IndexType(TYPE<  z> ) 

&  1  IN  ConiponentType(TYPE<z>)  >  >> 

-> 

(V  x)TSub<Int,  TYPE<x>, 

(V  z)TSub<arraj'  1..6  0/I..IOO,  TYPE<z>, 

X  IN  TYPE<x>  &  x-97  IN  Int  &  x-97  IN  TYPE<x> 

&  wp<x:-x-97,  X  IN  TYPE<x>  &  x  IN  IndexType(TYPE<z>) 

&  1  IN  ComponentType(TYPE<z>)  >  >> 

Before  we  can  apply  IndexType  and  ComponentType,  we  must  step  outside  the  rewrite  rules: 


(V  x)TSub<Int,  TYPE<x>, 

(V  z)TSub<arra>'  1..6  0/I..IOO,  TYPE<z>, 

X  IN  TYPE<x>  &  x-97  IN  Int  &  x-97  IN  TYPE<x> 

&  x-97  IN  TYPE<x>  &  x-97  IN  IndexType(TYPE<z>) 
&  1  IN  ComponentType(TYPE<z>)  >> 
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(V  x)TSub<lnl,  TYPE<x>, 

(V  z) 

(x  IN  TYPE<x>  &  x-97  IN  Int  &  x-97  IN  TYPE<x> 

&  x-97  IN  TYPE<x> 

&  x-97  IN  IndexType( arrov  1..6  0/I..IOO) 

&  1  IN  ComponentType(<3/Ta>'  1..6  0/I..IOO))  > 

(V  x)(V  z) 

(x  IN  Int  &  x-97  IN  Int  &  x-97  IN  Int 
&  x-97  IN  Int 

&  x-97  IN  IndexType(jrrav  1..6  0/I..IOO) 

&  1  IN  ComponentTypeiarrav  1..6  o/l,.100)) 

-> 

(V  x)(V  z) 

(x  IN  Int  &  x-97  IN  Int 
&  x-97  IN  1..6  &  1  IN  1..100) 

Since  there  are  no  occurrences  of  "z",  this  is 

(V  x) 

(x  IN  Int  &  x-97  IN  Int 
&  x-97  IN  1..6  &  I  IN  I..100) 

which  is  false,  since  x-97  IN  Int  is  not  true  for  all  "x". 

The  rewrite  rules  have  been  designed  so  that  each  rule  application  is  straightforward.  As  a  result, 
applying  the  rewrite  rules  takes  many  little,  well  understood  steps.  In  this  example  with  two  declara¬ 
tions  and  two  assignment  statements,  there  are  36  rule  applications,  13  obvious  simplifications,  a 
derivation  of  a  weakest  precondition,  and  two  substitution  function  applications.  This  is  a  tedious  pro¬ 
cess,  especially  for  examples  of  any  size.  The  derivation  process,  however,  except  for  deriving  weakest 
preconditions,  is  completely  mechanizable.  ISI’s  AFFIRM  system  [Musser  80],  for  example,  includes  a 
general  rewrite  rule  processor. 

/ 

6.4  CONCLUDING  REMARK 

In  the  previous  chapter  we  defined  the  pure  rewrite  rule  part  of  our  formalism.  In  this  chapter  we 
added  predicate  transformers  and  substitution  functions  to  allow  programs  to  be  composed  from  state¬ 
ments  and  declarations.  In  the  next  chapter  we  develop  the  remaining  part  of  our  formalism,  methods 
for  dealing  with  recursive  and  iterative  structures. 
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7.  CLEAN  EXECUTION  OF  PARAMETERIZED  DECLARATIONS 

Procedure  and  function  declarations,  loops,  and  parammeterized  type  declarations  can  all  be  viewed 
as  parameterized  declarations.  This  view,  which  this  chapter  explores,  highlights  two  unifying  aspects  of 
these  constructs:  parameters  and  recursive  definition. 

If  declarations  are  assertions,  parameterized  declarations  are  conditional  assertions.  For 
unparameterized  declarations,  once  the  legality  of  the  declaration  was  established,  asserted  properties  of 
the  declaration  could  be  used  in  establishing  properties  of  the  associated  scope.  For  a  parameterized 
declaration,  the  asserted  properties  are  conditional  upon  the  legality  of  the  arguments  used  in  invoca¬ 
tions. 

7.1  UNBOUNDED  OPERATIONS 

We  now  consider  operators  with  an  unbounded  number  of  possible  execution  steps:  loops  and 
recursive  functions  and  procedures.  No  formalism  can  capture  the  transformation  effected  by  a  poten¬ 
tially  unbounded  operation  without  introducing  a  potentially  unbounded  structure  of  its  own.  Induction 
is  one  way  of  doing  this;  recursive  definitions  such  as  fixed  point  operators  are  another.  Thus,  there 
can  be  no  purely  algorithmic  way  of  deriving,  for  example,  the  predicate  transformer  of  Dijkstra’s  DO 
statement  or  of  a  recursive  routine  body. 

One  approach  to  this  problem  is  for  the  language  designer  to  provide  inference  rules  and  the  pro¬ 
grammer  to  supply  sufficient  descriptions  of  routines  and  loops  to  prove  any  properties  of  interest. 
Pre/postconditions  and  loop  invariants  are  common  ways  of  providing  such  partial  descriptions. 

Although  this  is  principally  a  concern  in  proving  correctness  properties,  there  are  two  ways  in 
which  it  is  a  concern  in  proving  clean  execution.  Consider  the  situation  for  recursive  functions.  First, 
even  if  we  could  derive  the  transformation  effected  by  a  function,  the  clean  execution  rules  for  func¬ 
tions  would  be  inadequate.  In  deriving  a  clean  execution  condition  for  a  function  body,  we  cannot  in 
general  derive  the  clean  execution  condition  for  recursive  invocations  of  the  function.  Second,  after 
proving  that  a  function  invocation  is  legal,  we  must  prove  that  its  returned  value  is  used  legally.  To  do 
this,  we  need  a  sufficient  characterization  of  the  function’s  value  to  prove  the  clean  execution  of 
enclosing  program  structures. 

In  both  cases  the  situation  is  similar  to  proving  correctness  properties  about  a  recursive  function. 
The  language  designer  must  define  inference  rules  so  that  the  programmer  can  provide  additional  asser¬ 
tions  from  which  clean  execution  can  be  proved.  The  additional  assertions  must  in  turn  be  verified. 

7.2  LOOPS 

The  clean  execution  issues  of  parameterized  declarations  are  perhaps  most  simply  addressed  for 
loops.  Loops  are  a  degenerate  (single  instance)  case  of  parameterized  declaration  where  declaration  and 
invocation  merge.  Although  loops  can  be  dismissed  as  recursive  procedures,  they  differ  in  form  and  in 
the  way  assertions  are  usually  associated  with  them. 

Let  us  include  while  loops  then  in  our  example  language. 


A  while  loop 
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stmt  *  ...  I  while  expr  do  stmt  od 

executes  cleanly  if  the  Boolean  expression  and  the  loop  body  can  be  executed  cleanly  as  need  be.  Slat¬ 
ing  this  condition  recursively,  we  have  the  non-terminating  rewrite  rule: 

cleanSt(w/7//e  expr  do  stmt  od) 

cleanE(expr)  &  (-•  expr  |  cleanSt(stmt) 

Sl  wp<stmi,  desiikStiw/iile  expr  do  stmt  od)>) 


Rather  than  a  pre/ postcondition  description,  a  loop  is  typically  described  by  an  invariant  assertion 
that  is  true  whenever  the  loop  condition  is  to  be  evaluated: 

stmt  *  ...  1  while  expr  inv  asn  do  stmt  od 

The  above  non-terminating  clean  execution  rule  for  the  while  statement  suggests  the  following  infer¬ 
ence  rule 

asn  D  cleanE(expr) 

asn  &,  expr  12  cleanSt(stmi)  &  wp<stmt.  asn> 


asn  D  clean  St  expr  inv  asn  do  stmt  od) 

The  first  premise  enforces  the  good  programming  practice  of  including  conditions  that  ensure  clean  exe¬ 
cution  in  the  invariant  description  of  a  loop.  The  second  premise  ensures  both  that  the  body  of  the 
loop  can  be  executed  cleanly  if  the  assertion  and  loop  condition  are  satisfied  and  that  the  assertion  is 
indeed  an  invariant.  If  the  assertion  is  invariant,  cleanE(expr)  is  reestablished,  by  the  first  premise. 

The  consequent  of  this  inference  rule  justifies  the  following  rewrite  rule: 

cleanSt()v/7/7e  expr  inv  asn  do  stmt  od)  asn 


The  invariant  "asn"  need  only  be  a  clean  execution  invariant  and  need  not  be  sufficient  to  prove 
correctness  properties.  This  is  the  first  point  in  defining  clean  execution  for  our  example  language 
where  a  clean  execution  condition  might  be  stronger  than  necessary  for  the  clean  execution  of  the  asso¬ 
ciated  construct. 

If  the  while  loop  must  meet  the  pre/postcondition  pair 
P  D  wp<  while  expr  inv  asn  do  stmt  od,  Q> 
the  invariant  must  satisfy  additional  premises 
P  D  asn 

asn  k  -•  expr  D  Q 

wp<  while  expr  inv  asn  do  stmt  od,  TRUE> 

A  partial  correctness  rule  that  includes  clean  execution  of  the  loop  condition  and  the  first  two  correct¬ 
ness  premises  is  given  in  [German  78]. 
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7.3  NON-RECURSIVE  FUNCTION  DECLARATIONS 

Let  us  now  consider  function  declarations.  We  must  first  decide  when  a  function  declaration  is 
quaranteed  to  execute  cleanly.  However  natural  the  choice  may  have  seemed,  we  have  made  this 
choice  each  time  we  have  "defined"  a  clean  execution  rule.  Let  us  decide  that  a  function  declaration 
can  be  executed  cleanly  if  (1)  the  types  of  the  formal  parameters  are  legal,  (2)  the  type  (if  any)  of  the 
return  value  is  legal,  (3)  the  function  body  can  always  be  executed  cleanly,  and  (4)  a  legal  value  is 
returned.  (We  shall  consider  below  functions  where  clean  execution  of  the  body  depends  on  the  argu¬ 
ments  of  an  invocation.) 

The  clean  execution  rule  to  enforce  this  decision  depends  on  the  nature  and  form  of  function 
declarations.  Let  us  consider  functions  where  there  is  no  return  statement  and  a  value  is  returned  by 
assignment  to  a  distinguished  return  identifier: 

fcnDcln  =•  fen  fenid  [’(’  fenParmList  ’)’] 

re/wrnsvarld  type  ’=*’  routineBody 
fenParmList  =*  parmid  type  parmid  type) 
routineBody  =*  begin  scope  end 

The  function  declaration  rule  is  perhaps  most  straightforwardly  stated  as: 

cleanSc(/cn  fenId  [’(’  fenParmList  ’)’] 

returns  \2s\6.  type  ’*«’  routineBody  scope) 

[cleanPL(fcnParmList)  &] 
clean  T(  type) 

&  cleanSefroutineBody) 

&  wp< routineBody,  initialized(varld)  > 

8i  cieanS€(scope) 

wp<  routineBody,  initialized(varld)  >  is,  by  the  definition  of  wp,  the  weakest  condition  that  ensures 
that  a  legal  value  is  returned. 

cleanPL's  job  is  to  check  that  the  types  of  the  formal  parameters  are  legal.  We  shall  design 
cleanPL  to  accommodate  both  function  and  procedure  parameter  lists: 

parmList  *  fenParmList  |  proeParmList 
fenParmList  =  parmid  type  parmid  type) 
proeParmList  =*  [var]  parmid  type 

[var]  parmid  type) 

cleanPL  need  only  check  its  components: 

cleanPL:  ParmList  —  >  Assertion 

cleanPL([vaH  parmid  type 

[var]  parmidl  typel}) 
cleanT(type)  {& cleanT(typel)} 


The  above  function  declaration  rule,  however,  allows  a  predicate  transformer  to  intrude.  At  the 
price  of  introducing  artificial  program  text  and  of  a  slightly  more  cumbersome  rule,  we  shall  formulate 
the  rule  instead  as: 
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cleanScOtn  fcnJd  [’(’  fcnParmLisi  ')’] 
returns  varld  type 

begin  scope  end"''  scopel) 

[cleanPL(fcnParmList)  &] 
cleanT(type) 

&  cleanSc(scope  varld  ’  varld) 

&  cleanSc(scopel) 

(The  right-hand  side  of  this  rule  reduces  to  the  right-hand  side  of  the  previous  rule  using  the  assign¬ 
ment  statement  and  statement  sequence  rules.)  This  rule  takes  advantage  of  our  earlier  language  deci¬ 
sion  that  referencing  an  uninitialized  variable  is  illegal. 

In  order  to  make  the  scope  of  the  routine  body  accessible  in  this  last  rule,  we  pushed  grammatical 
detail  into  the  rule  by  expanding  intermediate  non-terminals.  In  writing  rewrite  functions,  there  are 
often  tradeoffs  between  cluttering  up  a  rule  by  expanding  intermediate  non-terminals  on  one  hand  and 
introducing  intermediate  functions  to  expand  the  grammar  rules  more  gradually  on  the  other. 

On  the  Advantages  of  Syntactic  Handles 

A  small  change  to  the  way  a  construct  is  provided  in  a  language  can  make  a  big  difference  in  for¬ 
mulating  a  crisp  clean  execution  rule  for  the  construct.  Consider,  for  example,  a  different  flavor  of 
functions  where  a  function  value  is  returned  via  a  returns  statement: 

stmt  *  ...  1  returns  ’(’  expr  ’)’ 

and  where  the  function  declaration  specifies  the  type  of  the  returned  value: 

fcnDcln  “  fen  fenid  (’(’  fenParmList  ’)’] 
type  routineBody 

We  must  add  a  rule  to  cleanSt  for  the  new  statement.  But  how  do  we  formulate  the  necessary  compa¬ 
tibility  check?  (In  the  previous  case,  this  check  was  subsumed  by  the  assignment  rule.)  Since  there  is 
no  name  for  the  return  value,  we  must  make  one  up: 

clennSti  re  turns  '('  expr  ’)’) 

cleanE(expr)  &  asgmtTC(expr,  RETTYPEO) 

Now  how  do  we  formulate  the  constraint  in  the  function  declaration  rule  that  a  legal  value  is  returned? 
Again  the  problem  is  the  lack  of  name,  and  again  the  solution  must  be  a  declared  attribute,  this  time 
one  that  serves  as  an  artificial  variable. 

To  the  programmer  the  differences  between  this  and  the  earlier  flavor  of  function  are  largely 
cosmetic.  In  formalizing  clean  execution  constraints,  however,  the  differences  require  two  more 
declared  attributes  and  two  more  substitution  functions.  We  shall  retreat  to  our  earlier  form  of  func¬ 
tion  as  the  basis  for  a  more  complete  rule  below. 

Substitution  Functions  for  Function  Declarations 

The  function  invocation  rule  given  in  Chapter  5  specified  a  parameter  type  compatibility  constraint 
in  terms  of  the  parameter  list  PARMS<fcnld>.  The  function  declaration  rule  must  supply 

PARMS<fcnId>: 
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cleanSc(/cw  fcnld  ’(’  fcnParmLisl  ’)" 
remr/7s  varld  type 

begin  scopQ  end'\'  scopel) 

— *’  cleanPL(fcnParmList) 

&,  cleanT(type) 

8t  cieanSc(scope  varld  ’  varld) 

&  PSub<fcnParmList,  PARMS<fcnId>, 
deanSc(scopel)  > 

(We  shall  treat  only  functions  (and  procedures)  that  have  parameters  in  the  rest  of  this  chapter.  Func¬ 
tions  without  parameters  can  only  compute  a  fixed  constant  and,  more  to  the  point  here,  their  clean 
execution  is  a  degenerate  case  of  that  for  functions  with  parameters.  An  additional  rule  would  have  to 
be  supplied  for  the  parameterized  case.) 

A  stronger  function  declaration  rule  is  possible.  In  proving  cleanSc(scope  varld  varld),  we 
can  assume  the  parameters  initially  have  legal  values.  We  also  need  the  type  of  the  return  parameter: 

cleanSc(/cA7  fcnld  ’(’  fcnParmList  ’)’ 
returns  varid  type 
’  =  ’  begin  scope  end'\'  scopel) 
cleanPL(fcnParmList) 

&  cleanT(type) 

&  (valueParmsInitd(fcnParmList) 

D  validFB(scope,  varld,  type)) 

&  PSub< fcnParmList,  PARMS<fcnId>, 
cleanSc(scopel)  > 

valueParmslnitd:  ParmList  — >  Assertion 

valueParmsInitd(parmld  type)  initialized(parmld) 

valueParmsInitd(vur  parmid type)  TRUE 

yalueParmsInitd(parmId  type  parmList) 

initialized(parmld)  &  valueParmsInitd( parmList) 

valueParmsInitd(vur  parmid  type  parmList) 
yalueParmslnitd(parmList) 


The  implication  in  the  term 

(yalueParmsInitd(fcnParmList) 

D  yalidFB(scope,  varld,  type)) 

allows  the  assumption  yalueParmsInitd(fcnParmList)  to  be  used  in  proving  vaIidFB(scope.  varid, 
type).  As  we  saw  earlier,  it  is  the  responsibility  of  the  function  invocation  rule  to  discharge  this 
assumption.  The  function  validFB  encapsulates  the  clean  execution  constraints  on  a  function  body: 

yalidFB:  Scope  x  Varld  x  Type  —  >  Assertion 

yalidFB(scope,  varld,  type) 

TSub<type,  TYPE<varld>, 
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/ 

cleanSc(scope  varld  ’  varlcl)> 


Functions  Where  a  Clean  Body  Depends  on  the  Invocation 

Thus  far  we  have  assumed  that  both  the  clean  execution  of  a  function  body  and  a  legal  value  being 
returned  are  properties  of  the  function  declaration.  There  are  functions  where  this  is  not  the  case,  i.e., 
whenever,  validFB(scope,  varld,  type)  cannot  be  simplified  to  true  assuming 
valueParmsInitd(fcnParmList).  If  not,  this  term  is  a  precondition  on  the  function's  legal  invocation. 

To  include  this  precondition  in  a  function  invocation  rule,  we  must  introduce  declared  attributes: 

cleanE(fcnId  ’(’  exprList  ')') 

“*■  YalidParms(exprLisU  PARMS<  fcnld>) 

St  pannInfo(PARMS< fcnld> ,  exprList, 

validFB(BODY < fcnld> ,  RETVAR< fcnld> , 

RETTYPE<fcnId>)) 

BODY<fcnId>,  RETVAR< fcnld> ,  and  RETTYPE<fcnId>  are  the  routine  body,  return  variable, 
and  return  type  respectively  of  "fcnid".  Corresponding  functions  must  be  supplied  by  the  function 
declaration  rule. 

parminfo  replaces  value  parameters  with  their  corresponding  arguments: 

parmlnfo:  ParmList  x  ExprList  x  Assertion  ->  Assertion 

pannlnfo(parmld  type,  expr,  assertion) 

”*■  ESub<expr,  parmid,  assertion> 

pannInfo(parmld  type  parmList, 
expr  exprList,  assertion) 

ESub<expr,  parmid, 

parmInfo(parmList,  exprList,  assertion)  > 


A  more  complicated  definition  is  recjuired  if  formal  parameters  can  be  redefined  within  the  func¬ 
tion  body.  Note  that  this  function  invocation  rule  makes  an  assumption  about  scoping  of  names:  global 
identifiers  referenced  in  the  function  body  must  be  accessible  at  the  point  of  function  invocation.  That 
is,  dynamic  and  static  scoping  must  agree.  The  scoping  rules  of  Euclid  were  made  more  complex  to 
ensure  that  this  assumption  could  be  enforced. 

With  this  change  to  the  function  invocation  rule,  the  function  declaration  rule  has  one  less  con¬ 
junct  and  three  more  substitutions: 

cleanSc(/c/?  fcnid  ’(’  fcnParmList  ’)’ 
returns  type 

begin  scope  end'-'  scopel) 
cleanPL(fcnParmList) 

St  cleanT(type) 

St  BSub< scope,  BODY<fcnId>, 

RVSub<varId,  RETVAR< fcnld> , 

RTSub<type,  RETTYPE< fcnld>, 

PSub<  fcnParmList,  PARMS<fcnId>, 
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cieanSc(scopel)  >>>> 

Since  the  substitutions  are  independent,  the  order  of  substitution  does  not  matter.  This  rule  is  as  bulky 
as  it  is  because  we  must  be  explicit  about  associating  declared  information  with  its  uses. 

7.4  RECURSIVE  FUNCTION  DECLARATIONS 

Let  us  now  consider  functions  where  clean  execution  of  the  function  body  depends  on  the  invoca¬ 
tion.  The  function  invocation  rule  can  no  longer  be  applied  within  the  body  of  a  recursive  function  in 
the  same  way  as  outside  the  body. 

The  programmer  must  supply  additional  assertions.  A  natural  form  for  the  additional  assertions  to 
take  is  a  precondition  sufficient  to  guarantee  the  function’s  clean  execution: 

fcnDcln  “  fen  fenid  ’(’  fenParmList  ’)’ 
returns  varld  type 
’  =“  ’  pre  precond 
begin  routineBody  end 

The  precondition,  which  need  not  be  a  precondition  for  the  function’s  meaningful  execution,  can  then 
be  used  where  the  clean  execution  condition  for  the  function's  invocation  is  needed.  Thus,  in  conU'ast 
to  the  rule  for  invocations  of  non-recursive  functions: 

cleanE(fcnId  ’(’  exprList  ’)’) 

validParms(exprList,  PARMS<fcnId>) 

&  pannInfo(PARMS<fcnId>,  exprList, 

validFB(BODY< fcnld> ,  RETVAR<  fcnld> , 

RETTYPE<fcnId>)) 


we  have 

cleanE(fcnId  ’(’  exprList  ’)’) 

validParmsCexprList,  PARMS<fcnId>) 

&  parmInfo(PARMS<fcnId>,  exprList,  PRE<fcnId>) 

pannlnfo,  defined  above,  substitutes  arguments  for  formal  parameters  in  the  precondition 
PRE<fcnId>,  a  declared  attribute. 

In  order  to  use  the  rewrite  rule  for  recursive  function  invocations,  we  must  verify  that  the  specified 
assertion  is  a  valid  precondition.  Let  us  assume,  with  no  loss  of  generality,  a  single  parameter. 
Expressing  the  function  declaration  more  tersely: 

fen  f  ’(’  p  t  ’)’  returns  r  tl 
’  *•  ’  pre  P  begin  S  end 

the  function  invocation  rewrite  rule  corresponds  to  the  implication 
(Vp:  t)(P  D  cIeanE(f(p))] 


The  precondition  P  is  valid  if 
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(Vp:t)[P  D  c!eanE(f(p))) 

I—  (Vp:  t)[P  D  cleanSc(S)) 

This  is  all  naturally  described  as  an  inference  rule; 

(Vp:  t)(P  D  cleanE{f(p))] 

1—  (Vp:  t)lP  D  cleanSc(S)] 


(Vp:  t)lP  D  cleanE(f(p))] 

In  order  to  use  the  rewrite  rule  corresponding  to  the  consequence  of  this  inference  rule,  we  must  verify 
the  antecedent  of  the  inference  rule. 

Proving  that  Uses  of  Function  Invocations  are  Legal 

In  proving  the  clean  execution  of  the  body  of  a  recursive  function,  we  needed  help  in  deriving  the 
clean  execution  condition.  In  proving  that  a  function  invocation  is  properly  used,  we  need  help  not  in 
deriving,  but  in  verifying  clean  execution  conditions  for  enclosing  program  structures. 

The  problem  is  the  same  as  in  verifying  correctness  properties  involving  the  function’s  value:  we 
need  a  sufficient  characterization  of  the  function’s  value  to  verify  the  properties  of  interest.  The  usual 
correctness  mechanisms  for  providing  this  characterization  apply  here,  such  as  pre/ postcondition  pairs. 

The  choice  of  mechanism  is  closely  tied  to  both  the  way  in  which  the  meaning  of  programs  is 
defined  and  the  means  of  verification  used.  If  pre/postcondition  pairs  are  used,  the  precondition  must 
also  be  a  precondition  for  the  function’s  legal  invocation,  as  discussed  earlier. 

For  the  sake  of  the  example  in  the  next  section,  we  shall  choose  pre/postcondition  pairs: 

cleanSc(/c;7  fcnid  ’(’  fcnParmList  ’)’ 
returns  varld  ’:’  type 
’  — ’  pre  asn  post  asnl 
begin  scope  end'\'  scope  1) 
cleanPL(fcnParmList) 

&  cleanT(type) 

&  PSub< fcnParmList,  PARMS<fcnId> , 

PreSub<asn,  PRE<fcnId>, 
cleanSc(scopel)  >> 

(There  would  be  little  reason  otherwise  for  us  to  make  a  choice  on  a  purely  verification  problem  since 
we  are  nearly  through  defining  our  example  language.) 

A  very  simple  proof  rule  will  enable  us  to  prove  the  properties  of  interest  in  the  example  below: 
(Vp:t)lP  3  wp<S,Q>] 


(Vp:t)(P  D  Q(f(p)/r)] 

This  proof  rule,  which  simply  ensures  that  the  pre/postcondition  is  a  valid  description  of  the  function 
body,  is  based  on  the  skeleton 
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pre  P  posi  Q  begin  S  end 

Since  we  assumed  that  function  formal  parameters  could  not  be  assigned  values,  no  renaming  conven¬ 
tion  is  needed  to  distinguish  the  values  of  the  parameters  before  and  after  invocation. 

7.5  A  WORKHORSE  REVISITED;  FACTORIAL  PROGRAM 

To  illustrate  the  clean  execution  rules  and  associated  inference  rules  for  recursive  functions,  we 
shall  use  a  familiar  example  for  program  proving  methods,  a  factorial  function  together  with  an  invoca¬ 
tion  of  that  function: 

fen  fact(n:  Int)  returns  f:Int 

=  pre  0  ^  n  &  n!  IN  Int;  post  f  —  n! 
begin  if  n—0  then  f:“  1  else  f:“n*fact(n-l)  fi  end, 
x:=*fact(10) 

0  ^  n  a:  n!  IN  Int  is  both  a  clean  execution  and  a  correctness  precondition  for  "fact". 

Our  goal  is  a  clean  execution  condition  for  this  program  segment  involving  only  the  type  of  the 
variable  "x"  and  the  type  "Int",  both  of  which  are  defined  external  to  the  program  segment. 

There  are  two  parts  to  the  derivation.  The  bulk  of  the  derivation  is  done  using  only  the  rewrite 
rules  and  substitution  functions  defined  above  (together  with  the  simplifications  TRUE  &  P  P  and 
P  &  P  P).  The  resulting  clean  execution  condition  involves  the  term  fact(lO),  however,  which  the 
second  part  of  the  derivation  replaces  using  the  pre/postcondition  description  of  "fact". 

Each  part  of  the  derivation  requires  a  supporting  condition  to  be  verified.  First,  since  we  used  the 
pre/postcondition  description  in  replacing  fact(lO),  we  must  verify  the  pre/ postcondition  of  "fact".  We 
sketch  this  verification  using  classical  program  verification  methods  as  Lemma  1  below.  Second,  we 
used  the  function  invocation  clean  execution  rule  in  deriving  a  clean  execution  condition.  We  thus 
must  prove  the  antecedent  of  the  inference  of  which  the  function  invocation  clean  execution  rule 
corresponds  to  the  consequence.  This  we  do  in  Lemma  2  below. 

In  both  lemmas  we  rely  on  familiar  properties  of  the  integers  and  of  factorial.  We  shall  assume 
about  the  type  "Int"  only  that  it  is  a  contiguous  subrange  including  0.  A  useful  property  that  such 
subranges  have  is  that  for  0  <  n,  n!  IN  Int  implies  (n-D!  IN  Int. 

Derivation: 

A  clean  execution  condition  for  the  program  segment  can  be  derived  as  follows: 

cleanSc(/c/7  fact(n:  Int)  returns  f:Int 

“  pre  0  <  n  &  n!  IN  Int;  post  f  *  n! 

begin  if  n=“0  then  f:**l  else  f:"‘n*fact(n-l)  fi  end, 

x:=“fact(10)) 

-> 

cleanPL(n:Int)  &  cleanT(Int) 

&  PSub<n:  Int,  P ARMS < fact >, 

PreSub<0  ^  n  &  n!  IN  Int,  PRE<fact>, 
cleanSc(x:*fact(10))  >> 

-> 


cieanT(Int)  &  TRUE 
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&  PSub<n:  Int.  PARMS<facl>, 

PreSub<0  ^  n  &  n!  IN  Inl,  PRE<fact>, 
cleanSt(x;  — fact(lO))  >> 

-> 

TRUE 

&  PSub<n:  Int,  PARMS<facl>, 

PreSub<0  ^  n  &  n!  IN  Int,  PRE<fact>, 
validRefIx)  &  cleanE^factdO^) 

&  asgmtTC(fact(10),  ItsType(x))  >> 

(In  order  to  use  the  function  invocation  clean  execution  rule,  we  must  verify  Lemma  2.) 
-> 

PSub<n:  Int,  PARMS<fact>, 

PreSub<0  ^  n  &  n!  IN  Int,  PRE<fact>, 

TRUE  &  validParmsdO,  PARMS< fact>), 

&  parmInfo(PARMS< fact> ,  10,  PRE<fact>) 

&  factClO)  IN  ItsType(x)  >> 

-> 

PSub<n;  Int,  PARMS<fact>, 

PreSub<0  ^  n  &  n!  IN  Int,  PRE<fact>, 
validParmsdO,  PARMS<fact>), 

&  parmInfo(PARMS< fact> ,  10,  PRE<fact>) 

&  factdO)  IN  TYPE<x>  >> 

validParmsdO,  n:  Int) 

&  parmInfo(n:  Int,  10,  0  <  n  &  n!  IN  Int) 

&fact(10)  IN  TYPE<x> 

-> 

cleanEdO)  &  valueParmTCClO,  Int) 

&  ESub<10,  n,  0  ^  n  &  n!  IN  Int> 

AfactdO)  IN  TYPE<x> 

-> 

TRUE  &  asgmtTCdO,  Int) 

&  ESub<  10,  n,  0  <  n  &  n!  IN  Inl> 

AfactdO)  IN  TYPE<x> 

-> 

10  IN  Inl 

&  ESub<10,  n,  0  <  n  &  n!  IN  Int> 

&  factdO)  IN  TYPE<x> 

10  IN  Int 

&0  <  10  &  10!  IN  Int 
&  factdO)  IN  TYPE<x> 


It  remains  to  replace  factdO).  We  can  do  this  using  the  postcondition  of  "fact"  if  we  can  show  that 
the  precondition  0  ^  n  &  n!  IN  Int  is  satisfied.  (This  assumes  that  the  pre/ postcondition  of  "fact"  has 
been  verified,  which  we  do  in  Lemma  1.)  But  the  middle  two  conjuncts  of  the  clean  execution  condi¬ 
tion  are  exactly  the  precondition  needed. 
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Thus,  the  desired  clean  execution  condition  is: 

10  IN  Int 

&0  ^  10  &  10!  IN  Int 
&  10!  IN  TYPE<x> 

This  clean  execution  condition  must  then  be  verified  based  on  the  types  "Int"  and  TYPE<x>. 
Lemma  1  (verify  the  pre/postcondition  for  "fact"): 

We  must  verify  the  antecedent  of  the  inference: 

(Vn:  lnt)[0  ^  n  ^  n!  IN  Int  D  f  —  ni>] 


(Vn:  lnt)[0  ^  n  &  n!  IN  Int  D  fact(n)  *n!] 

Strengthening  the  precondition  with  n!  IN  Int  changes  notning  in  this  oft  repeated  verification  exercise, 
which  we  shall  not  repeat  here.  The  0  <  n  in  the  precondition  ensures  termination:  each  invocation  of 
"fact"  either  terminates,  immediately  or  decreases  the  (non-negative)  value  of  "n". 

Lemma  2  (verify  the  premise  of  the  inference  assumed  by  the  function  invocation  rule): 

We  now  must  verify  the  premise  of 

(Vn:  lnt)[0  ^  n  &  n!  IN  Int  D  cleanE(fact(n))] 

I—  (Vn:  lnt)[0  ^  n  &  n!  IN  Int  D  cleanSc(//..y7)] 


(Vn:  lnt)[0  ^  n  &  n!  IN  Int  D  cieanE(fact(n))] 


Let  us  first  derive  cleanSc(//..y?),  More  precisely,  as  noted  in  the  previous  section,  we  must  derive 
TSub<Int,  TYPE<f>,  cleanSc(//..y7)  >: 

TSub<Int,  TYPE<f>, 

cleanSc(// n=0  then  f:=*l  else  f:=*n*fact(n-l)  /i)> 

-> 

TSub<Int,  TYPE<f>, 

cleanSt(//n“0  ///ewf:*!  e/se  f:  =  n*fact(n-l)  y?)  > 

-> 

TSub<Int,  TYPE<f>, 
cIeanE(n  =  0)  &  (n  -  0  &  cleanSt(f:  — 1) 

I"*  n  —  0  &  cleanSt(f:=“n*fact(n-l))) > 

-> 

TSub<Int,  TYPE<f>, 
cleanE(n)  &  cleanE(O) 

&  (n  -  0  &  validRef(f)  &  cleanE(l) 

&  asgmtTCd,  ItsType(f)) 

I"*  n  "  0  &  validRef(f)  &  cleanE(n*fact(n-l)) 

&  asgmtTC(n*fact(n-l),  ItsType(f))) > 


TSub<Int,  TYPE<f>, 
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TRUE  Si  TRUE 

&  (n  -  0  &  TRUE  &  TRUE  &  1  IN  ItsType(f) 

1-  n  -  0  &  TRUE  &  cleanE(n)  &  cleanE(fact(n-l)) 

&  n*fact(n-l)  IN  Int 
&  n*fact(n-l)  IN  ItsType(0)> 

TSub<Int,  TYPE<f>, 

(n  -  0  &  1  IN  TYPE<f> 

1-  n  -  0  &  TRUE  &  cleanE(facl(n-l))  &  n*fact(n-l)  IN  Int 
&n*fact(n-l)  IN  TYPE<f>)> 

(n  -  0  &  1  IN  Int 

|  -  n  -  0  &  TRUE  Si  cleanE(fact(n-l))  &  n*fact(n-l)  IN  Int 
Si  n*fact(n-l)  IN  Int) 

(n  -  0  &  1  IN  Int 

1^  n  -  0  &  cleanE(fact(n-l))  &  n*fact(n-l)  IN  Int) 


Let  us  assume 

(1)  (Vn;  lnt)(0  ^  n  &  n!  IN  Int  D  cIeanE(fact(n))] 

It  remains  to  prove  for  (Vn:  Int) 

(0  ^  n  &  n!  IN  Int  D 
(n  -  0  &  1  IN  Int 

I”*  n  *  0  &  cleanE(fact(n-l))  &  n*fact(n-l)  IN  Int)) 

This  is  vacuously  true  for  negative  "n"  in  "Int".  For  non-negative  "n"  we  proceed  by  induction.  This  is 
obviously  true  for  n  *■  0.  Let  us  assume  that  it  is  true  for  all  k  ^  n-1.  We  must  show  that  it  is  true 
for  k  *  n.  If  0  ^  k  &  k!  IN  Int  is  false,  the  condition  above  is  true.  Let  us  assume  then 

(2)  0  ^  k  &  k!  IN  Int 

This  allows  us  to  simplify  what  it  remains  to  prove  to 

(k  -  0  &  1  IN  Int 
h  k  -  0  &  0  <  k-1  &  (k-D!  IN  Int 
Si  k*fact(k-l)  IN  Int) 

By  the  contiguous  subrange  assumption,  we  can  apply  both  assumption  1  and  the  pre/postcondition 
description  of  "fact"  to  produce 

(k  -  0  &  1  IN  Int 
h  k  -  0  &  0  <  k-1  &  (k-D!  IN  Int 
Si  k*(k-l)!  IN  Int) 

Using  properties  of  the  integers  and  the  assumption  that  "Int"  is  a  contiguous  subrange,  this  becomes 
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(k  -  0  &  1  IN  Int 

|0  <  k&k*(k-l)!  IN  Int) 

which  follows  immediately  from  the  properties  of  factorial  and  (2). 

7.6  PROCEDURE  DECLARATIONS 

The  clean  execution  issues  of  procedure  declarations  are  largely  those  of  function  declarations. 
The  types  of  the  formal  parameters,  if  any,  must  be  legal,  and  the  procedure  body  must  be  cleanly  exe¬ 
cutable. 

As  for  functions,  we  shall  treat  only  procedures  with  parameters: 

procDcln  *  proc  procid  ’(’  procParmList  ’)’ 

’  —  ’  routineBody 

(Unless  we  allow  global  variables,  procedures  without  variable  parameters  are  null  operations.) 

Let  us  first  consider  the  simpler  case  where  clean  execution  of  the  procedure  body  is  a  property  of 
the  declaration.  We  implicitly  assumed  such  procedures  in  our  earlier  procedure  invocation  rule 
(rewritten  here  assuming  parameters); 

cleanSt(procId  ’(’  exprList  ’)’) 

validParms(exprList,  PARMS<  procId>) 

Analogously  with  function  declarations,  we  have  for  procedure  declarations: 

cleanSc(proc  procid  ’(’  procParmList  ’)’ 
routineBody  scope) 
cleanPL(procParmList) 

&  (valuePannsInitd(procParmList) 

D  vjrPflrm7>'pes(procParmList, 
cleanSc(routineBody) ) ) 

&  PSub<  procParmList,  PARMS<procId>, 
cleanSc(scope)  > 

varParmTypes  supplies  the  types  of  variable  parameters: 

varParmTypes:  ProcParmList  x  Assertion  —>  Assertion 

varParniTypesiparmld  type,  assertion)  assertion 

varParmTypesivar  parmld  type,  assertion) 

TSub<type,  TYPE<  parmld>,  assertion> 

varParmTyp€s{p2iTm\d  type  procParmList,  assertion) 

— ^  vflr/’flrm7y/7es( procParmList,  assertion) 

varParmTypesivar  parmld  type  procParmList,  assertion) 

TSub<type,  TYPE<parmId>, 

varParmTypesipTOcPsiTmLisU  assertion)  > 
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The  effect’  of  variable  parameters  is  specified  as  a  series  of  substitutions  applied  to 
cleanSc(routineBody).  The  effect  of  value  parameters  requires  an  implication  (as  it  did  for  function 
invocations).  In  the  procedure  invocation  rule  below  for  procedures  where  clean  execution  of  the  body 
is  a  property  of  invocations,  both  value  and  variable  arguments  are  handled  by  substitutions.  The  lack 
of  symmetry  in  the  above  rule  results  from  our  being  able  to  assume  for  value  parameters  only  that 
they  have  some  value. 

Let  us  now  consider  procedures  where  the  clean  execution  of  the  body  is  a  property  of  each  invo¬ 
cation.  The  procedure  invocation  rule  is 

deanSt(procId  ’(’  exprList  ’)')• 

validParms(exprList,  PARMS< procId>) 

&  parmInfo(PARMS<  procId> ,  exprList, 
cleanSc(BODY  <  procId>)) 

parmlnfo  must  be  expanded  to  handle  variable  parameters: 
parminfo;  ParmList  x  ExprList  x  Assertion  ->  Assertion 

parmliiiotparmid  type,  expr,  assertion) 

ESub<expr,  parmld,  assertion> 

parmInfo(vjr  parmld  type,  expr,  assertion) 

ESub<expr,  parmld, 

TSub<type,  TYPE<parmld>,  assertion>> 

parmInfo(parmId  type  parmList, 
expr  exprList,  assertion) 

ESub<expr,  parmld, 

parmInfo(parmList,  exprList,  assertion)  > 

parmInfo(var  parmld  type  parmList, 
expr  exprList,  assertion) 

ESub<expr,  parmld, 

TSub<type,  TYPE<  parmld> , 

parmInfo(parmList,  exprList,  assertion)  >> 

The  procedure  declaration  rule  is  correspondingly  simpler: 

cleanSc(/7roc  procid  ’(’  procParmList  ’)’ 

’  — ’  routineBody  scope) 

“*  cleanPL(procParmList) 

&  BSub<  routineBody,  BODY<  procid> , 

PSub<  procParmList,  PARMS<  procld> , 
cleanSc(scope)  >  > 


Recursive  Procedures 

There  are  two  problems  in  proving  the  clean  execution  of  recursive  procedures,  quite  analogous  to 
those  for  recursive  functions.  First,  in  deriving  a  clean  execution  condition  for  a  procedure  body  we 
cannot  in  general  derive  the  clean  execution  condition  for  recursive  invocations.  Second,  after  proving 
that  a  procedure  invocation  is  legal,  we  must  prove  that  its  uses  are  legal. 
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The  solutions  to  these  problems  are  also  quite  analogous  to  the  solutions  for  recursive  functions. 
We  shall  introduce  a  precondition  for  recursive  procedures  sufficient  to  guarantee  the  procedure’s  clean 
execution: 

procDcln  *  proc  procid  ’(’  procParmList  ’)’ 
pre  precond  routineBody 

The  rules  for  recursive  procedures  are  then 

cleanSt(procId  ’(’  exprList  ’)’) 

validParms(exprList,  PARMS<procId>) 

&  parmInfo(PARMS<procId>,  exprList, 

PRE<procId>) 

cleanSc(proc  procid  ’(’  procParmList  ’)’ 

’=*'  pre  precond  routineBody  scope) 
cieanPL(procParmList) 

PreSub<  precond,  ?RE<procId>, 

PSub<  procParmList,  PARMS<  procid >, 
cleanSc(scope)  >  > 

The  inference  rule  that  serves  as  the  basis  for  these  rules  is  analogous  to  that  for  recursive  functions 
above. 

A  sufficient  characterization  of  the  effect  of  the  procedure  is  needed  to  verify  the  clean  execution 
of  enclosing  program  structures.  Again,  this  is  purely  a  verification  problem  and  one  addressed  by 
existing  formalisms. 

7.1  GLOBAL  IDENTIFIERS 

Global  identifiers  complicate  the  statement  and  verification  of  correctness  properties.  As  a  result, 
many  definitions  of  proof  rules  assume  that  functions  and  procedures  have  been  rewritten  to  replace 
global  variables  and  constants  with  explicit  parameters.  Among  the  correctness  properties  affected  are 
the  pre-  and  postconditions  used  above. 

In  contrast,  global  identifiers  present  no  particular  problems  in  formulating  clean  execution  rules. 
The  clean  execution  of  a  function  or  procedure  body  may  depend  on  global  values  just  as  it  may  depend 
on  argument  values.  Dependence  on  global  values  will  be  reflected  in  the  clean  execution  condition  for 
the  function  or  procedure  body.  Thus,  all  the  preceding  clean  execution  rules  work  in  the  presence  of 
global  identifiers. 

7.2  PARAMETERIZED  TYPES 

As  the  last  construct  in  our  example  language,  let  us  include  parameterized  types  with  value 
parameters: 

namedType  “  typeld  [’{’  fcnParmList  ’)’]  ’**’  type 


In  addition  to  the  earlier  rule 
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cleanSc(jype  typeld  '  —  ’  type  scope) 
cleanT(type) 

&  TSub<type,  typeld,  cleanSc(scope)  > 
we  need  the  rule 

cleanSc(ryp^  typeld  fcnParmList  ’)  —  ’  type  scope) 
cleanPL(fcnParmList) 

&  (valueParmsIiiitd(fcnParmList) 

D  cleanT(type)) 

&  PTSub<type,  typeld  ’(’  fcnParmList  ’)’, 
PSub<  fcnParmList,  PARMS<typeId>, 
cleanSc(scope)  >> 


Previously,  substitutions  have  replaced  either  single  identifiers  or  declared  attribute  names  of  the 
form  ATTR1BUTE< identifier >.  PTSub,  however,  cannot  simply  replace  a  parameterized  type 
identifier;  it  must  also  substitute  for  its  parameters.  PTSub  thus  performs  a  macro  substitution. 

cleanT  now  includes 

cleanT(typeld  exprList  ’)’) 

“*■  TalidParms(exprList,  PARMS<  typeld>) 


If  clean  execution  of  the  body  of  a  type  definition  is  an  invocation  property  and  not  a  declaration 
property,  the  rule  for  parameterized  type  declarations  is  simpler  and  the  rule  for  instantiations  more 
complex,  quite  analogously  with  functions  and  procedures. 

7.3  SUMMARY 

This  concludes  the  clean  execution  definition  of  our  example  language.  These  last  two  chapters 
have  extended  the  rewrite  rule  approach  to  statement  sequences,  declarations,  and  parameterized 
declarations,  introducing  predicate  transformers,  substitution  functions,  and  pre/ postconditions  to  do 
so. 
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8.  THE  METHOD  AND  ITS  APPLICATION  TO  EUCLID 

In  this  chapter  we  consider  the  method  for  specifying  clean  execution  illustrated  in  the  previous 
three  chapters.  We  first  define  the  method  more  precisely,  and  then  address  clean  execution  problems 
encountered  in  defining  the  clean  execution  of  the  programming  language  Euclid  [Lampson  et  al.  77]. 
(Appendix  B  gives  a  clean  execution  definition  for  much  of  Euclid.)  We  then  discuss  how  clean  execu¬ 
tion  conditions  for  more  general  language  features  can  be  handled  with  the  approach  taken  for  static 
scoping  and  statically  determinable  types. 

8.1  THE  METHOD 

The  method  for  defining  clean  execution  for  a  language  is  conceptually  simple;  (1)  create  (or 
modify)  a  grammar  for  the  language,  and  (2)  starting  with  "program",  construct  rewrite  rules  for  each 
language  construct  that  check  the  operands  and  check  how  they  are  combined.  We  shall  make  these 
steps  more  precise  and  comment  for  each  step  on  matters  of  style  that  can  lead  to  a  better  definition. 

The  process  of  defining  clean  execution  is,  of  course,  not  sequential,  but  iterative.  First,  defining 
the  clean  execution  of  a  construct  may  cause  the  construct  to  be  redefined.  The  clean  execution  rule 
for  the  construct  must  in  turn  be  redefined,  and  so  forth.  Second,  although  a  grammar  for  a  construct 
must  exist  before  its  clean  execution  can  be  defined,  defining  clean  execution  can  cause  the  grammar  to 
be  revised. 

Step  1:  Constructing  a  Grammar 

Constructing  a  grammar  that  serves  as  the  basis  for  the  clean  execution  definition  is  an  important 
"first"  step.  The  only  requirement  on  the  clean  execution  grammar  for  it  to  be  correct  is  that  it  must  be 
consistent  with  other  grammars  for  the  language,  such  as  reference  grammars  or  grammars  used  in 
parsing. 

Although  it  is  easy  to  construct  a  clean  execution  grammar  that  is  correct  in  this  narrow  sense,  it  is 
much  harder  to  contruct  one  that  facilitates  writing  the  clean  execution  definition.  The  grammars  differ 
largely  in  focus.  A  non-terminal  in  a  reference  grammar  may  be  a  terminal  from  the  point  of  view  of 
clean  execution.  Additional  non-terminals  may  be  needed  to  capture  relevant  abstractions.  For  exam¬ 
ple,  a  rewrite  function  that  deals  with  parameter  lists  would  require  an  additional  non-terminal  if  the 
grammar  had  previously  been  defined  with  parameter  list  represented  only  as  a  sequence  of  parameters. 

The  clean  execution  grammar  is  only  a  basis  for  the  clean  execution  definition.  For  example,  in 
some  cases  it  is  convenient  to  expand  intermediate  non-terminals  to  expose  the  relevant  clean  execu¬ 
tion  concerns  for  a  particular  rewrite  function.  Also,  a  different  subgrammar  entirely  can  serve  as  the 
basis  for  a  rewrite  function.  For  example,  we  shall  see  below  where  a  right  recursive  subgrammar  is 
used  in  the  definition  of  overlapping  arguments  in  Euclid  instead  of  the  left  recursive  definition  used 
otherwise. 

It  has  proved  helpful  in  writing  clean  execution  rules,  both  for  the  example  language  and  for 
Euclid,  to  restrict  the  form  of  grammar  rules.  The  individual  rules  are  fairly  simple.  The  only  use  of 
alternatives  is  where  the  right-hand  side  of  a  rule  is  a  list  of  alternatives.  Separate  alternatives  are 
created  if  there  are  different  clean  execution  problems.  There  are  no  grammar  rules  where  an  alterna¬ 
tive  is  "empty".  Although  brackets  and  braces  are  used  for  optional  items  and  an  arbitrary  number  of 
items,  respectively,  these  are  seldom  nested. 
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^Step  2:  D^ne  Clean  Execution 

The  clean  execution  definition  consists  of  rewrite  functions,  substitution  functions,  some  definition 
of  the  meaning  of  programs  such  as  predicate  transformers,  and  inference  rules  for  functions,  pro¬ 
cedures,  and  loops.  The  rewrite  rules  are,  of  course,  the  heart  of  the  clean  execution  definition.  The 
others  are  introduced  only  as  need  be. 

Substitution  functions  are  closely  tied  to  the  programming  language’s  definition  of  scoping,  and  the 
forms  of  both  the  inference  rules  and  the  definition  of  the  meaning  of  programs  are  closely  tied  to  the 
verification  approach  taken.  These  ties  have  been  explored  above,  and  we  shall  not  recapitulate  them 
here. 

Each  rewrite  function  should  focus  on  a  single  language  construct  or  clean  execution  concern. 
Thus,  for  example,  initialized(varRen  should  not  have  to  be  concerned  whether  "varReT  is  a  valid 
reference.  That  is  for  validRef  to  check.  This  separation  of  concerns  is  critical  to  a  perspicuous 
definition. 

Defining  clean  execution  rules  is  conceptually  a  top-down  process:  Starting  with  clean(program), 
each  rewrite  function  is  defined  in  terms  of  rewrite  functions  for  the  component  constructs  until  ail  are 
expressed  in  terms  of  defined  rewrite  functions  and  of  assertions  in  the  underlying  assertion  language. 
As  discussed  in  Chapter  4,  conjuncts  checking  that  individual  operands  are  legal  must  precede  those 
that  ensure  that  operands  are  combined  legally.  Conventions  such  as  naming  conventions  and  the  use 
of  brackets  and  braces  in  rewrite  rules  have  proved  useful,  but  need  not  be  considered  part  of  the  for¬ 
malism. 

To  ensure  that  a  set  of  rewrite  rules  are  properly  defined,  we  must  establish  several  properties: 

(1)  Applying  the  rewrite  rules  in  an  arbitrary  order  to  clean( program)  always  terminates  (finite 
termination). 

(2)  The  rewrite  rules  always  yield  the  same  answer  (unique  termination). 

(3)  The  answer  is  in  the  underlying  assertion  language. 

These  properties  follow  from  constraints  on  the  way  rewrite  functions  are  constructed: 

(1)  A  rewrite  function  must  be  total  in  the  following  sense:  it  must  handle  all  possible  inputs 
of  the  specified  grammatical  types  that  meet  statically  enforceable  constraints.  (This 
ensures  that  if  the  process  of  applying  the  rewrite  rules  terminates,  the  result  will  be  in  the 
underlying  assertion  language.) 

(2)  A  rewrite  function  must  be  deterministic.  (That  is,  the  rules  for  a  rewrite  function  must 
be  disjoint.) 

(3)  Rewrite  functions  must  not  interact.  (This  ensures  that  the  order  of  rule  application  does 
not  matter.  Together  with  the  previous  requirement,  this  also  guarantees  unique  termina¬ 
tion.) 

Given  these  restrictions,  it  remains  only  to  show  that  there  is  no  infinite  chain  of  rewrite  functions 
possible.  Again  this  is  a  matter  of  construction: 

(4)  The  rewrite  functions  must  be  designed  so  that  some  measure  of  the  assertions  is  well- 
founded. 
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This  is  not  difficult  since  almost  all  rewrite  rules  are  length  reducing  in  the  length  of  their  argu¬ 
ments.  Let  us  first  consider  the  case  when  all  rewrite  functions  are  length  reducing.  Let 
LENGTH (programConstruct)  be  the  number  of  programming  language  symbols  in  "programConstruct". 
Let  k,  magically  enough,  be  the  integer  one  greater  than  the  product  of  (1)  the  maximum  number  of 
arguments  a  rewrite  function  takes  and  (2)  the  maximun  number  of  distinct  subassertions  in  the  right 
hand  side  of  any  rewrite  rule.  The  following  cost  function  is  a  measure  of  assertions  that  decreases  for 
each  application  of  a  rewrite  rule: 

cost(asserlion  binBoolOpr  assertion! ) 

“  cost(assertion)  -h  cost(assertionl) 

costt--  assertion)  =■  cost(assertion) 
cost(V  assertion)  —  cost(assertion) 

cost(TRUE)  =  0 
cost(  FALSE)  =  0 
cost(expr  compOpr  exprl)  =*0 
cost(IDEQ(id,  id!))  =  0 

cost(expr  IN  type)  =■  cost(type) 

cost(programText)  =*  LENGTH (programText) 

cost(rewriteFunction  ’(’  argument  argument!}  ’)’) 

*  (k  **  cost(argument))  {*  k  **  cost(argument!)} 

The  key  to  this  measure  is  the  cost  of  a  rewrite  function  invocation.  Essentially,  only  rewrite  functions 
have  cost.  The  choice  of  "k”  ensures  that  the  right-hand  side  of  a  rewrite  rule  always  has  lower  cost 
than  that  of  the  left-hand  side. 

This  measure  can  be  extended  to  rewrite  rules  that  do  not  increase  the  length  of  their  arguments, 
such  as  the  cleanE(varRef)  rule.  One  way  of  doing  this  is  to  assign  weights  to  each  rewrite  function 
and  weight  the  product  in  the  last  rule  of  "cost"  above. 

Step  3:  Repeat  Until  Done 


8.2  CLEAN  EXECUTION  ISSUES  IN  EUCLID 
Overlapping  Parameters 

The  Euclid  Report  bans  overlapping  arguments.  This  ban  is  specified  in  the  clean  execution 
definition  as  noOverlap(exprList).  To  define  noOverlap,  we  shall  first  reduce  the  property  about  a  vec¬ 
tor  of  arguments  to  one  about  each  argument  not  overlapping  with  all  following  arguments.  We  shall 
then  express  this  property  in  terms  of  pairs  of  arguments  not  overlapping.  (To  focus  on  the  interesting 
part  of  the  problem,  we  shall  assume  that  we  are  dealing  with  a  parameter  list  of  only  variable  parame¬ 
ters.) 

noOverlap:  VarList  — >  Assertion 


noOYerlap(  varR  eO 


TRUE 
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noOverlap(varRef  varRefList) 
noO(varRef,  varRefList) 

noO:  VarRef  x  VarRefList  — >  Assertion 

noO(varRef,  varRefl)  overlap(varRef,  varRefl) 

noO(varRef,  varRef!  varRefList) 

-•  overlap(varRef,  varRefl) 

&  noOCvarRef,  varRefList) 

We  now  must  define  what  it  means  for  two  variable  references  to  overlap.  Two  variable  references 
overlap  if  they  are  part  of  the  same  main  variable  and  if  corresponding  indices  are  the  same.  In  Euclid, 
since  there  are  no  array  slices,  this  is  more  simply  stated:  two  variables  overlap  if  one  reference  is  an 
initial  segment  of  the  other.  As  a  result,  it  is  more  convenient  to  use  a  right  recursive  definition  of 
"varRer  than  the  left  recursive  one  used  elsewhere. 

varRef  =  varld  [indices] 
indices  =  index  [indices] 
index  —  ’(’  expr  ’)’ 

I '(’  pointer  ’)’ 

I  fieldid 

To  be  complete,  we  should  prove  that  this  and  our  earlier  grammar  define  the  same  language, 
overlap  is  now  defined  as: 

overlap:  VarRef  x  VarRef  —  >  Assertion 

overlap( varld,  varldl  [indices])  IDEQ(varId,  varldl) 

overlap(varId  indices,  varldl)  IDEQ(varId,  varldl) 

overlap(varId  indices,  varldl  indicesl) 

IDEQ(varId,  varldl) 

&  indicesOveriap(indices,  indicesl) 

indicesOverlap:  Indices  x  Indices  — >  Assertion 

indicesOverlap(index,  indexl  [indices]) 

IndexEquaKindex,  indexl) 

indicesOverlap(index  indices,  indexl) 

IndexEquaKindex,  indexl) 

indicesOverlap(index  indices,  indexl  indicesl) 

IndexEquaKindex,  indexl) 

&  indicesOverlap(indices,  indicesl) 

indexEqual:  Index  x  Index  ->  Assertion 

indexEqualCC  expr  ’(’  expri  ')’)  —  expr  -  exprl 

IndexEquaK'C  pointer  poinierl  ’)’) 
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“*■  pointer  —  pointerl 
indexEquaK'.'  fieldid,  fieldidl) 
—  IDEQ(fieldId,  fieldidl) 


The  first  two  productions  of  the  rewritten  "varReP  could  be  expressed  more  concisely  as 
varRef  =*  varld  (index) 

Using  both  the  non-terminals  "index"  and  "indices",  however,  allows  a  cleaner  statement  of  the  clean 
execution  rules. 

Referencing  an  Uniniiialized  Variable 

The  Euclid  Report  [Lampson  et  al.  77]  does  not  forbid  referencing  an  uninitialized  variable.  The 
Euclid  designers  felt  that  verification  requires  stronger  properties  to  be  proved.  An  InRange{x)  predi¬ 
cate  has  since  been  added  to  the  language  [Holt  et  al.  78],  however,  to  allow  other  clean  execution  con¬ 
straints  to  be  enforced. 

Mixing  Signed  and  Unsigned  Integers 

Euclid  includes  both  signed  and  unsigned  integer  types.  Languages  in  the  past,  including  BCPL 
[Richards  69]  and  Mesa  [Mitchell  et  al.  78],  have  often  had  problems  in  properly  providing  mixed 
mode  arithmetic.  These  problems  usually  result  from  the  conflict  between  making  the  capabilities  of 
the  underlying  hardware  available  in  the  programming  language  and  not  wanting  to  check  at  run-time 
which  type  of  arithmetic  to  use.  (A  nice  discussion  of  the  problems  in  Mesa  appears  in  [Wirth  77c].) 

The  Euclid  designers  attempted  to  set  out  minimal  capabilities  that  a  Euclid  implementation  must 
provide  by  defining  when  an  arithmetic  operation  is  well-behaved.  The  definition  of  "well-behaved", 
however,  was  one  of  the  most  consistently  criticized  parts  of  successive  drafts  of  the  Euclid  Report.  As 
illustrated  by  the  definitions  of  "well-behaved"  and  the  supporting  notion  "extended  range  operand"  in 
Appendix  B,  rewrite  rules  provide  a  natural  and  precise  notation  for  providing  such  definitions. 

Generic  Operators 

A  formalization  problem  is  posed  by  generic  operators  such  as  "+",  "-",  and  "*",  which  are  both 
arithmetic  and  set  operators.  Since  the  rewrite  rule  process  is  tied  to  the  structure  of  an  expression,  the 
same  rule  must  apply;  yet  obviously  the  "same"  rule  cannot  apply. 

The  solution  is  a  clean  execution  condition  that  is  conditional,  with  a  clause  for  each  type  of  opera¬ 
tor.  For  example,  consider  the  ItsType  rule: 

ItsType(expr  ambiguousOpr  exprl) 

IF  IntegerType?(ItsType(expr)) 

THEN  ’Integer’ 

ELSE  ItsType(expr)  FI 

(As  was  noted  earlier,  the  IF  operator  for  assertions  is  just  syntactic  sugar.) 

IntegerType?  answers  a  question  about  the  underlying  type: 
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Integer! ype?:  Type  ->  Assertion 
r  simple  types  */ 

IntegerType?(’Integer’)  TRUE 
IntegerType?(’SignedInt’)  TRUE 
IntegerType?(’Unsignecllnt')  TRUE 

IntegerType?(’AddressType’)  TRUE 

IntegerType?('Boolean’)  FALSE 

IntegerTyp€?(’Char’)  FALSE 
IntegerType?(’StorageUnit’)  FALSE 

IntegerType?(enumeratedType)  FALSE 

IntegerType?(cons consl)  IntegerType?(ItsType(cons)) 

/•  structured  types  */ 

IntegerType?(structuredType)  FALSE 


8.5  DYNAMIC  SCOPING  AND  BINDING 

We  restricted  our  example  language  to  static  scoping  and  objects  with  statically  determinable  types. 
This  included  the  execution  errors  of  such  languages  as  Fortran,  Algol  60,  Pascal,  Euclid,  and  Modula. 

With  dynamic  scoping  and  dynamic  binding  other  execution  errors  become  possible.  For  example, 
we  can  no  longer  check  statically  whether  an  array  value  will  be  assigned  to  a  scalar  variable,  whether  a 
character-valued  argument  will  be  passed  to  a  Boolean  parameter,  whether  a  function  invocation  will 
have  the  right  number  of  arguments,  whether  variables  will  be  assignable  as  need  be,  whether 
identifiers  will  be  visible  as  need  be,  or  even  whether  a  variable  has  previously  been  declared. 

These  execution  constraints  share  a  property  in  common;  in  each  case  the  clean  execution  condi¬ 
tion  cannot  in  general  be  expressed  as  a  constraint  on  the  values  of  program  variables.  Rather,  a  con¬ 
straint  on  the  attributes  of  variables  is  needed. 

One  way  to  provide  such  constraints  is  to  augment  the  assertion  language  with  assertions  about 
attributes.  A  clean  execution  condition  for  one  of  these  constraints  would  be  expressed  as  such  a  term, 
and  each  declaration  would  provide  the  attribute  information  to  discharge  any  such  assertions. 

Alternatively,  such  constraints  can  be  handled  within  the  existing  framework.  Additional  derived 
attributes  would  be  required.  These  would  serve  the  same  role  as  the  assertions  about  attributes  above, 
but  would  be  defined  by  rewrite  functions. 
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9.  CONCLUSIONS 


9.1  SUMMARY 

This  thesis  provides  a  method  for  defining  clean  execution.  This  method  uses  rewrite  rules  organ¬ 
ized  as  assertion  functions.  The  definition  is  recursive,  based  on  a  grammar  that  differs  from  other 
grammars  for  the  language  largely  in  its  focus  on  clean  execution  issues.  Essentially,  an  assertion  func¬ 
tion  corresponding  to  each  non-terminal  of  the  programming  language  translates  a  construct  of  the 
appropriate  syntactic  type  to  a  term  in  the  assertion  language. 

Key  to  specifying  these  assertion  functions  are  declared  attributes  of  variables,  constants,  types, 
and  routines.  Declared  attributes  are  an  unambiguous  way  of  specifying  structural  information  given  in 
the  corresponding  declaration,  such  as  the  type  of  a  variable  or  the  parameter  list  of  a  procedure.  For 
each  declared  attribute  there  is  a  substitution  function  that  associates  the  declared  information  with  its 
uses,  taking  scoping  into  account. 

Statement  sequences,  declarations  composed  with  their  associated  scopes,  and  parameterized 
declarations  (procedure  and  function  declarations,  loops  and  parameterized  types)  cannot  be  defined  by 
rewrite  rules  alone.  We  extended  the  rewrite  rule  approach  to  these  constructs,  using  predicate 
transformers,  substitution  functions,  and  pre/postconditions  to  do  so. 

9.2  OBSERVATIONS  ON  THE  METHOD 
Importance  of  the  Grammar 

Although  a  clean  execution  grammar  obviously  borrows  from  other  grammars  for  the  language,  its 
design  is  surprisingly  important.  A  problem  in  writing  a  crisp  definition  of  clean  execution,  such  as  not 
being  able  to  separate  concerns  properly,  was  quite  often  the  result  of  decisions  about  the  grammar. 
What  are  usually  matters  of  taste  in  designing  grammars,  such  as  recursive  versus  iterative  definition, 
not  only  influence  the  ease  of  writing  clean  execution  rules,  but  also  can  expose  or  hide  clean  execution 
concerns. 

Embed  Clean  Execution  Conditions  in  the  Grammar? 

Given  the  intimate  relationship  between  the  grammar  and  its  clean  execution  definition,  we  might 
want  to  embed  clean  execution  conditions  directly  in  the  grammar  rather  than  maintaining  separate, 
parallel  definitions.  There  are  several  disadvantages  to  this,  however.  First,  to  speak  of  a  single  gram¬ 
mar  is  somewhat  misleading.  The  grammar  is  only  a  basis  for  the  clean  execution  definition.  A  rewrite 
function  may  use  a  different  subgrammar  or  expand  the  basis  grammar  to  expose  relevant  non¬ 
terminals.  Second,  in  any  case  there  must  be  some  formalism  for  defining  functions  such  as  "Index- 
Type"  and  "valueParmsInitd".  Only  the  "clean"  functions  can  be  embedded  in  a  grammar. 

Rewrite  Rule  Method  as  Specification  Technique 

Implicit  in  the  application  of  the  rewrite  rule  approach  is  the  hard  problem  of  deciding  when  execu¬ 
tion  errors  can  (or  should)  occur.  Software  engineering  principles  are  as  important  in  defining  clean 
execution  as  in  defining  programs.  In  order  to  have  a  clear,  easily  extended  definition,  modularization 
and  proper  choice  of  abstractions  (rewrite  functions)  are  critical. 
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9.3  CONTRIBUTIONS 

This  thesis  provides 

(1)  the  first  tool  for  formally  defining  clean  execution  conditions.  The  method,  based  on 
rewrite  rules,  allows  sufficient  conditions  to  be  defined  that  ensure  clean  execution.  This 
provides  a  more  complete  language  definition,  closing  the  language  definition  gap  between 
syntactic  correctness  and  correctness. 

(2)  a  notation  for  documenting  design  decisions  about  execution  constraints. 

(3)  a  discipline  for  considering  clean  execution  during  language  design.  As  in  defining  proof 
rules,  constructing  clean  execution  rules  serves  as  a  useful  discipline  that  makes  it  neces¬ 
sary  to  be  explicit  about  the  interactions  of  language  features.  Since  the  language  designer 
must  define  the  semantics  of  ail  cases  not  excluded  by  the  clean  execution  definition,  he  is 
careless  in  using  this  discipline  at  his  own  peril. 

(4)  better  tools  for  verifying  clean  execution.  Applying  the  formalism  traces  errors  back  in  the 
‘program  text,  possibly  discovering  preconditions  on  a  routine's  legai  invocation.  More¬ 
over,  even  if  a  verification  fails,  it  can  expose  program  faults,  such  as  implicit  assumptions 
about  how  a  routine  is  used  or  degenerate  cases  of  data  structures  or  loops. 

(5)  clean  execution  conditions  for  much  of  the  programming  language  Euclid. 

9.4  FUTURE  DIRECTIONS 

There  are  several  ways  in  which  this  work  could  be  extended  to  test  its  wider  applicability.  First,  if 
the  method  is  to  be  judged  a  success,  it  should  provide  a  convenient  notation  for  expressing  execution 
constraints  such  as  those  on  non-determinism,  parallelism,  machine  arithmetic,  and  more  sophisticated 
notions  of  type  compatibility.  Work  is  needed  both  in  exploring  the  suitability  of  this  method  for  such 
problems  and  in  using  the  method  to  provide  a  precise  specification  of  the  necessary  constraints. 

Second,  hidsight  is  20-20.  The  method  can  be  fully  tested  as  a  vehicle  for  designing  and  communi¬ 
cating  clean  execution  definitions  only  as  part  of  a  language  design  effort. 

Third,  both  the  processes  of  defining  clean  execution  rules  and  of  deriving  clean  execution  condi¬ 
tions  could  be  better  explored  with  mechanical  help.  A  system  could  check  a  clean  execution  definition 
as  it  is  given,  ensuring  that  proper  rewrite  rules  are  given,  the  rules  are  consistent  with  the  grammar,  a 
rewrite  function  has  rules  for  all  cases  of  its  domain,  and  the  rules  so  far  converge.  A  system  could 
also  support  such  tasks  as  identifying  parts  of  the  existing  definition  affected  by  proposed  changes  to  the 
grammar. 
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APPENDIX  A:  A  CLEAN  EXECUTION  GRAMMAR  FOR  EUCLID 


This  appendix  contains  the  grammar  of  the  programming  language  Euclid  that  is  the  basis  of  the 
clean  execution  definition  given  in  Appendix  B.  The  organization  of  this  grammar  follows  roughly  that 
of  the  Euclid  Report  (Lampson  et  al.  77],  and  differs  from  that  grammar  largely  in  its  focus  on  clean 
execution  issues.  Such  a  grammar  is  an  important  first  step  in  writing  a  clean  execution  definition  of  a 
language. 

The  main  departures  from  the  Report’s  grammar  are 


(1)  A  number  of  the  non-terminals  in  the  Report’s  grammar  serve  as  terminals  in  the  clean 
execution  grammar,  since  no  further  expansion  is  needed  to  define  clean  execution. 

(2)  Euclid  standard  components  have  been  added  to  the  clean  execution  grammar.  Although 
standard  components  play  an  important  role  in  Euclid,  the  Report’s  grammar  makes  no 
mention  of  themi.  A  number  of  these  standard  comsponents  correspond  directly  to  func¬ 
tions  that  are  central  to  the  clean  execution  definition.  Standard  function  designators  and 
standard  component  function  designators  have  also  been  added  to  the  grammar  as  possible 
function  values. 


(3)  The  treatment  of  constants  in  the  Report’s  grammar  is  inadequate  and  is  expanded  here. 

(4)  The  expression  grammar  given  has  no  precedence  of  operators.  This  could  be  added  very 
straightforwardly,  adding  only  detail  to  the  clean  execution  rules.  Moreover,  all  assertion 
and  substitution  functions  dealing  with  expressions  would  have  to  contend  with  this  detail. 
Although  the  resulting  grammar  is  ambiguous,  the  ambiguities  present  no  problems. 

(5)  The  semicolon  silliness  of  [Lampson  et  al.  77]  has  been  done  away  with  by  perhaps  Pro¬ 
crustean  means.  No  semicolons  are  to  be  inserted,  and  no  semicolons  are  optional.  This 
allows  us  to  ignore  an  irrelevant  source  of  complexity. 


(6)  Identifier  lists  in  declarations  and  formal  parameter  lists  must  have  been  rewritten  so  that 
each  identifier  has  a  separate  declaration.  This  rewriting  is  trivial  and  makes  the  presenta¬ 
tion  of  the  clean  execution  definition  much  crisper. 

(7)  Productions  have  been  rewritten  to  eliminate  all  but  one  occurrence  of  ’empty’  and  all  but 
one  production  of  the  form  Ihs  —  [rhs].  This  quite  often  makes  the  intent  of  the  produc¬ 
tions  clearer,  and,  more  importantly,  eliminates  any  options  as  to  what  is  to  be  rewritten. 


The  following  abbreviations  are  used  throughout  this  grammar; 
cons  -  constant 
dcln  -  declaration 
eS  -  executableScope 
expr  -  expression 
id  -  identifier 
opr  -  operator 
parm  -  parameter 
ref  -  reference 
var  -  variable 
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EXECUTABLE  SCOPES  AND  DECLARATIONS 

eS  —  [checkedClause]  [dcin  statement 

dcln  =  dclnPart  dcinPart} 

dcInPart  ■■  typeDcln 
I  consDcln 
I  varDcIn 
I  varBindingDcln 
I  procedureDcln 
1  functionDcln 
I  assert  assertion 

assertion  =*  [’(’  expr  ’)’] 


TYPES 

typeDcln  —  type  iypQld  [typeFormalList]  ’=*’ 

[pre  assertion]  typeDefinition 

typeFormalList  —  ’(’  id  type  id  type)  ’)’ 

typeDefinition  —  type  \  forward 

type  =“  simpleType  |  structuredType 

simpleType  —  standardSimpleType 
1  enumeratedType 
I  subrangeType 
1  namedType 

StandardSimpleType  **  ’Integer’  1  ’Signedint’ 

I  ’Unsignedint’  |  ’Boolean’  |  ’Char’ 

I  ’StorageUnit’  |  ’AddressType’ 

enumeratedType  —  ’(’  enumeratedValueld  {’,’  enumeratedValueld)  ’)’ 
subrangeType  =*  cons  cons 


namedType  *  [containingVar  ’.’]  typeld  [typeActualList] 
I  ref  ’.  ItsType’ 

I  type  ’.  IndexType’ 

I  ref  ’.  IndexType’ 

I  type  ’.  ComponentType’ 

I  ref  ’.  ComponentType’ 

I  type  ’.  BaseType’ 
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I  ref  BaseType’ 

I  varRef  ObjectType’ 

typeActualList  *  typeActualParm  typeActualParm}  ’)’ 
typeActualParm  =“  cons  |  any 


structuredType  **  [packech  unpackedStructuredType 
i  namedType 

unpackedStructuredType  =•  arrayType  I  recordType 
I  moduleType  |  setType  |  collectionType 
1  pointerType 

arrayType  *  array  indexType  of  componentType 

indexType  =*  simpleType 

componentType  =*  type 

recordType  *  record  fieldList  endRecord 

fieldList  “  [recordDcln] 
i  [recordDcln]  case  tag 
[defauh  manifestCons] 
o/variant  variant] 

[otherwise  ’  *  >  ’  recordDcln]  end  case 

recordDcln  *  recordDclnPart  recordDclnPart] 

recordDclnPart  =  consDcln  |  varDcln 

variant  =*  caseLabelList  ’=■>’  recordDcln  ewtf  caseLabel 

moduleType  *•  module  [id]  [importClause]  [exportClause] 
moduleBody  endModule 

moduleBody  *  [checkedClause]  [dcln] 

[initially  routineDefinition  ’;’] 

[invariant  assertion  ’;’] 

setType  “  set  o/baseType 

baseType  “  simpleType 

collectionType  —  [countControl]  collection  of  [/>?  zoneld] 

ObjectType  =*  type 


pointerType 


collectionVarld 
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•  VARIABLES 

varDcIn  *  var  id  [fixedAddress]  type 

varBindingDcln  *  bind  varBinding 

I  bind'V  varBinding  {’/  varBinding}  ’)' 

varBinding  =■  IvarBindingCondilion]  id  lo  varRef 

varRef  *  varld 

I  varRef  ’(’  expr  ’)’ 

I  varRef  ’('  pointer 
I  containingVar  fieldid 

pointer  —  varRef 
1  functionDesignator 
1  functionDesignatorComponent 

containingVar  =■  varRef 


CONSTANTS 

consDcln  *  const  \d  type)  cons 

1  const  id  type  structuredCons 

cons  *  literalCons  |  consRef 
I  consStandardComponent  |  expr 

literalCons  “  unsignedNumber  1  literalstring 
I  literalChar  1  enumeratedValueld 

literalstring  *  {extendedCharacterj 

consRef  —  consid 
I  consRef  ’(’  expr  ’)’ 

I  consRef  fieldid 
I  containingVar  fieldid 
1  recordType  fieldid 

consStandardComponent  *  type  size’ 

I  ref  size’ 

I  type  ’.  alignment’ 

I  ref  ’.  alignment’ 

I  simpleType  ’.  first’ 

I  ref  ’.  first’ 

I  simpleType  ’.  last’ 

I  ref  ’.  last’ 

I  ’StorageUnit .  sizeInBits’ 

1  varRef  ’.  sizeInBits’ 

I  varRef  ’.  address’ 
j  ref  ’.  itsTag’ 
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1  collectionVarld  nil’ 

1  collectionVarld  ’(’  pointer  ’)  .  refCount’ 

ref  *  varRef  1  consRef 


EXPRESSIONS 

expr  *  varRef 
I  cons 

I  functionValue 
!  ’(’  expr  ’)’ 

I  selTypeld  ’(’  [element  element}]  ’)’ 
I  expr  setOpr  expr 
I  expr 

1  expr  arithmeticOpr  expr 
1  expr  comparisonOpr  expr 
1  expr  [not]  in  simpieType 
I  not expr 

1  expr  binaryBooleanOpr  expr 
element  =*  expr  1  simpieType 
setOpr  -  ’  +  ’ I I  xor 
ambiguousOpr  —  |  | 

arithmeticOpr  —  addingOpr  j  multiplyingOpr 
addingOpr  *  ’-f  ’  | 
multiplyingOpr  =■  |  div  \  mod 

relationalOpr  —  [not]  m i  comparisonOpr 
comparisonOpr  =*  [not]  ’=*’ i  inequalityOpr 
inequalityOpr  =■ 

binaryBooleanOpr  =*  |  or  \  and 


FUNCTION  VALUES 

functionValue  =■  functionDesignator 
I  functionDesignatorComponent 
I  standardFunctionDesignator 
I  typeConverterDesignator 

functionDesignator  —  function  [’(’  exprList  ’)’] 


function  *■  [containingVar  funclionid 
exprList  —  expr  expr) 

functionDesignaiorComponent  —  functionBase  ’(’  expr 
1  functionBase  fieldid 

functionBase  functionDesignator 
1  functionDesignatorComponent 

standardFunctionDesignator  *  siandardFunction  '(’  expr  ’)’ 

I  simpleType  standardComponentFunction  ’(’  exprList  ’)’ 

I  ref  StandardComponentFunction  ’(’  exprList 
I  collectionVarld  Index  (’  pointer  ’)’ 

standardFunction  =*  ’Abs’  [  ’Odd’  1  ’Chr’ 

StandardComponentFunction  *  ’Succ’  1  ’Pred’  \  ’Ord’  1  ’Max’  ]  ’Min’ 
typeConverterDesignator  —  typeConverterld  ’(’  expr  ’)’ 


ROUTINE  DECLARATIONS 

procedureDcln  *  [inline]  procedure  id  [formalParmList] 

’*’  routineDefinition 

functionDcln  *  [inline]  function  id  [formalParmList] 
reTurns  \(i  type  ’  — ’  routineDefinition 
I  typeConverterDeclaration 

formalParmList  *  ’(’  formalSection  {’,’  formalSection)  ’)’ 

formalSection  ^  bindingCondition  id  ’:’  type 

routineDefinition  =*  [importClause]  [pre  assertion] 

[post  assertion]  routineBody 

importClause  *  imports  importList  {’;’  imports  importList] 

importList  =*  ’(’  bindingCondition  id  {’,’  bindingCondition  id)  ) 

routineBody  =*  begin  eS  end  id 
1  forward 


STATEMENTS 
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.  statement  =*  statement  statement 
I  varRef  ’  expr 

I  [containingVar  procedureld  [’(’  exprList  ’)’] 
1  exit 
I  return 

1  assert  assertion 
I  begin  eS  end 

I  //expr  then  eS  [else  eS]  end  if 
1  case  expr  of  caseBody  end  case 
I  case  [consii  parm  expr 
of  caseBody  end  case 
1  case  varBindingCondition 
parm  bound  to  varRef 
of  caseBody  end  case 
I  loop  eS  end  loop 
I  empty 

caseBody  =  [caseLabelList  ’==>’  eS  caseLabe! 

caseLabelList  ’=>’  eS  end 
caseLabel)]  otherwise  eS] 


PROGRAMS 

program  typeDcln  typeDcln} 
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appendix  B:  A  CLEAN  EXECUTION  DEFINITION  FOR  MUCH  OF  EUCLID 


This  appendix  gives  clean  execution  rules  for  much  of  the  programming  language  Euclid.  Not 
included  are  machine  dependencies,  storage  allocation,  forward  declarations,  modules,  type  compatibil¬ 
ity,  and  some  auxiliary  functions. 

This  clean  execution  definition  is  based  on  the  grammar  for  Euclid  in  Appendix  A,  which  should 
be  read  in  conjunction.  The  definition  assumes  that  the  program  text  meets  the  statically  enforceable 
constraints  defined  in  the  Euclid  Report. 

The  following  constructs  are  defined  in  the  Euclid  Report  by  translation  into  other  legal  Euclid  con¬ 
structs: 

(1)  variable  references  of  the  form:  pointer 

(2)  final  actions  in  modules, 

(3)  initializations  in  variable  declarations, 

(4)  conditional  escape  statements  (uses  of  when), 

(5)  return  statements  that  return  values, 

(6)  elself  clauses, 

(7)  for  statements, 

(8)  uses  of  parameter, 

(9)  pervasive,  and 

(10)  implicit  imports  through  //;ws  lists. 

(The  last  two  of  these  do  not  influence  clean  execution.)  These  translations  must  previously  have  been 
applied  before  the  clean  execution  rules  (or  the  Euclid  Proof  Rules  [London  et  al.  78])  can  be  applied. 
Also,  the  Euclid  Report  [Section  11]  states  that  in  a  function  declaration  a  "return  statement  without 
any  value  is  supplied  automatically  just  before  the  end  of  the  body".  The  rules  below  assume  that 
return  statements  have  been  so  supplied. 

The  following  conventions  are  observed  in  the  clean  execution  rules  below: 

(1)  Basic  symbols  of  Euclid  are  italicized  if  they  are  alphabetic  and  enclosed  in  single  quotes  if  not. 
(Euclid’s  single  quote  is  represented  as  a  double  quote  mark  since  Euclid  has  no  double  quote  mark.) 
As  a  shorthand  for  strings  of  quoted  symbols,  only  leading  and  trailing  single  quotes  need  be  given. 
Thus,  ’nil’  is  abbreviated  ’.  nil’. 

(2)  Identifiers  that  appear  free  in  the  rewrite  rules  are  implicitly  universally  quantified  over  all  valid 
strings  of  the  corresponding  syntactic  unit.  A  free  identifier  is  either  the  same  as  the  name  of  the  syn¬ 
tactic  unit,  or  is  the  name  of  the  syntactic  unit  followed  by  a  single  decimal  digit.  By  subscripting 
identifiers  in  this  way,  different  occurrences  of  a  syntactic  unit  can  be  distinguished  on  the  right  hand 
side  of  a  rule,  as  in 

cIeanE(expr  ’/’  exprl) 

cleanE(expr)  &  cleanE(exprl) 

&  exprl  ^  0 

&  expr  ’/’  exprl  IN  Signedint 


(3)  Two  conventions  used  in  the  grammar  have  fairly  natural  analogs  on  the  right  hand  sides  of 
rewrite  rules.  In  the  grammar,  possible  omission  of  a  construct  is  indicated  by  enclosing  the  construct 
within  metabrackets  ’[’  and  ’]’.  In  the  rewrite  rules,  an  optional  portion  of  the  pattern  on  the  left  hand 
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'dean:  ES  —  >  Assertion 


note.  A  Euclid  program  consists  of  a  series  of  module  type  declarations.  The  Euclid  Report 
"does  not  specify  how  the  module  types  declared  in  programs  are  instantiated  to  start  a 
program".  The  module  types  at  the  outermost  level  are  restricted  in  form:  for  example,  it 
makes  no  sense  for  them  to  be  parameterized  and  the  only  reasonable  precondition  is 
TRUE.  There  is  no  advantage  (or  need),  however,  to  providing  a  separate  rewrite  function 
for  modules  at  the  outermost  level,  dean,  which  handles  executable  scopes,  can  thus 
check  programs. 

The  optional  include  clause  that  is  part  of  a  program  is  just  an  alternative  source  of 
(possibly  precompiled)  module  type  declarations.  We  assume  here  that  uncompiled 
module  type  declarations  are  included  before  applying  dean  and  that  precompiled  module 
type  declarations  were  checked  before  being  compiled. 


dean(checkedClause  eS)  dean(eS) 

dean(co«snd  type  cons  eS) 

deanT(type)  &  deanE(cons) 

&  asgmtTC(type,  cons) 

&  ESub<cons,  id, 

TSub<type,  TYPE<id>,  dean(eS)  >> 

dean(cons/  id  ’  cons  eS) 
deanE(cons) 

&  ESub<cons,  id, 

TSub<ItsType(cons),  TYPE<id>,  dean(eS)  >> 

dean ( CO /7S/ id  type  structuredCons  eS) 

deanT(type)  &  asgmtTC(type,  structuredCons) 

&  ESub<  structuredCons,  id, 

TSub<type,  TYPE<id>,  dean(eS)  >> 

note.  Rhetorical  question:  Doesn’t  this  create  structured  constant  references  such  as  (1,2,3)  (i), 
which  are  not  legal  Euclid.  Rhetorical  answer  (in  two  parts):  For  run-time  checking,  a 
clean  execution  condition  can  be  associated  with  the  potentially  offending  construct.  Thus, 
such  structured  references  cannot  occur.  For  verification,  such  references  must  either  by 
ESub  or  be  included  in  the  assertion  language. 

dean(vflrid  [fixedAddress]  type  eS) 
deanT(type) 

&  (V  id)  TSub<type,  TYPE<id>,  dean(eS)  > 

cle^nibind  [varBindingConditionl  id  to  varRef eS) 
validRef( varRef) 

8t  ESub  <  varRef,  id, 

TSub< ItsType( varRef),  TYPE<id>,  dean(eS)  >> 

dean(^/W’(’  [varBindingConditionl  id  /o  varRef 

[varBindingConditionl]  idl  ro  varRefl}  ’)  eS)) 
yalidRef(varRef)  (&  vtlidRef(varRen)} 
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&  noOverlap(varRef,  {varRefl))  ' 

St  ESub<varRef,  id, 

TSub<  ItsType(varRef),  TYPE<id>, 

{ESub< varRefl,  idl, 

TSub< ItsType(varRen),  TYPE<idl>,} 
ciean(eS)  {>  > }  >  > 

dean(/y/?e  id  '=■’  [pre  assertion]  type  scope) 
cleanT(type) 

St  PreSub< assertion,  TRUE  [&  PRE<id>], 

TSub<type,  id,  clean(scope)  >> 

note.  The  clean  execution  rules  for  type,  procedure,  and  function  declarations  assume  that  clean 
execution  of  the  body  is  a  declaration  property.  The  changes  needed  if  legality  is  an 
instantiation  property  are  discussed  in  Chapter  7. 

note.  If  no  precondition  is  specified,  the  Report  specifies  that  the  precondition  is  TRUE. 

cle^nitype  id  formalParmList 

’  =  ’  assertion]  type  scope) 
deanFPL(formalParmList) 

&  ([assertion  &]  valueParmsInitd(formalParmList) 

">  parmTypes(formalParmList,  deanT(type))) 

St  PreSub< assertion,  TRUE  [St  PRE<id>], 

PTSub<type,  id  ’(’  formalParmList  ’)’, 

PSub< formalParmList,  PARMS<id>, 
clean  (scope)  >  >  > 

deandm/md  procedure  \(1  formalParmList 
[importClause]  [pre  assertion] 

[posr assertion!]  routineBody  scope) 
deanFPL(formalParmList) 

Si  ([assertion  &]  YalueParmsInitd(formalParmList) 

“>  parmTypes(formalParmList, 

PreSub< assertion,  PRE<id>, 

PSub<  formalParmList,  PARMS<id>, 
dean(routineBody)  >>) 

St  PreSub< assertion,  PRE<id>, 

PSub<  formalParmList,  PARMS<id>, 
dean  (scope)  >  > 

clestnilinline]  function  id  formalParmList 
returns  idl  type 
’  — ’  [importClause]  [pre  assertion] 

[post  assertion  1]  routineBody  scope) 
deanFPL(formalParmList) 

Si  deanT(type) 

Si  ([assertion  &]  valueParmsInitd(formalParmList) 

*>  parmTypes(formalParmList, 

TSub(type,  TYPE<idl>, 

PreSub< assertion,  PRE<id>, 

PSub<  formalParmList,  PARMS<id>, 
dean(routineBody  idl  idl)  >>) 


•li¬ 


st  PreSub<  assertion,  PRE<id>, 

PSub<formalParmList,  PARMS<id>, 
clean(scope)  >  > 


clean  assertion)  cleanA(assertion) 

clean(stmt  stmti) 

clean(stmt)  St  wp<stmt,  clean(stmtl)  > 

clean(varRef  expr) 

— validRef(varRef)  St  cleanE(expr) 

St  asgmtTC{expr,  ItsType(varRef)) 

clean([containingVar  procid  [’(’  exprList  ’)’])  • 
[validRef(containingVar)  &] 
valld?arii5s(exprLisl,  ?ARMS<  procId>) 

Si  pariiiInfo(exprList,  ? ARMS <  procid >,  PRE<procld>) 

dean(£x/7)  TRUE 

cle^aireturn)  “*■  TRUE 

clesini beg i n  qS  end)  clean(eS) 

clean(//expr  then  tS  [e/seeSl]  end  ij) 
cleanE(expr) 

St  (expr  St  clean(eS)  j-*  expr  St  clean(eS)) 

synonym  cL  —  caseLabel 

cLList  —  caseLabeiList 

ciean(case  expr  o/[cLList  eS  end  cL 

cLListl  ’  =  >’  eSl  end ch\]]  end  case 
cleanE(expr) 

[&  (in(expr,  cLList)  *>  clean(eS)) 

{&  (in(expr,  cLListl)  — >  clean(eSl))}] 

St  (in(expr,  cLList)  {|in(expr,  cLListl)}) 

clean  (cose  expr  0/ [cLList  ’■*>’  eS  end  cL 
cLListl  eSl  endcU]] 

otherwise  ’*>’  eS2  end  case 
cleanE(expr) 

[&  (in(expr,  cLList)  *>  clean(eS)) 

{&  (in(expr,  cLListl)  ■>  clean(eSl))}] 

St  (-•  inCexpr,  cLList)  {&  “  in(expr,  cLListl)} 

—  >  clean(eS2)) 

clean ( case  parm  expr  of 

[cLList  eS  end  cL 

cLListl  ’»>’  eSl  end cLl]] 

[’;’  otherwise  eS2]  end  case 

TSub<ItsType(expr),  TYPE<parm>, 


ESub<expr,  parm,  / 

cleanE(expr) 

[&  (in(ItsTag(expr),  cLList)  ->  clean(eS)) 
{&  (in(ItsTag(expr),  cLListl) 

—  >  clean(eSl))}] 

[&  (“’  in(ItsTag(expr),  cLList) 

{&-•  iii(ItsTag(expr),  cLListD) 

—  >  clean(eS2))]  >> 

dean(  )  TRUE  /*  the  empty  statement  */ 
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cieanE:  Expr  ->  Assertion 
anonym  cE  “  clean E 


cE(varRef)  validRef(varRef)  &  initializeiKvarRef) 

cE(literalCons)  TRUE 
cE(consRef)  yalidRef(consRef) 

/*  constant  standard  components  */ 
cE(type  size')  — ^  cleanT(type) 
cE(ref size’)  —  TRUE 

note.  A  variable  need  not  have  a  valid  value  in  order  for  its  "size",  "alignment",  "first",  "last",  and 
"address"  components  to  have  valid  values.  Similarly,  it  makes  sense  to  ask  for  a  variable 
or  constant  reference’s  "size",  "alignment",  "first",  "last",  and  "sizeInBits”  components,  but 
not  "address",  even  if  an  index  is  not  valid.  The  alternative  in  all  these  cases  is: 
cE(ref  ’.  component’) 
validRef(ref) 

cECtype  alignment’)  — cleanT(type) 
cE(ref  ’.  alignment’)  TRUE 
cE(simpleType  ’.  first’)  cleanT(simpleType) 
cE(ref ’.  first’)  ^  TRUE 
cE(simpleType  ’.  last’)  cleanT(simpIeType) 
cE(ref  ’.  last’)  TRUE 
cEC’StorageUnit .  sizeInBits’)  TRUE 
cECvarRef  ’.  sizeInBits’)  TRUE 
cE(varRef  ’.  address’)  validRef(var) 

cE(ref  ’.  itsTag’)  validRef(ref) 
cE(collectionVarId  ’.  nil’)  TRUE 

cE(collectionVarId  ’(’  pointer  ’)  .  refCount’)  cE(pointer) 

note.  There  is  no  rewrite  rule  for  cleanE(pointer),  but  rather  rules  for  cleanE  of  "var", 
"functionDesignator",  and  "functionDesignatorComponent",  which  make  up  "pointer". 

note.  Since  it  is  "expr"  whose  cases  we  are  enumerating,  there  need  be  no  rewrite  rule 
corresponding  to  the  production 
cons  “  ...  I  expr 

/•  function  designators  */ 
cE([containingVar  ’.’]  functionid  [’(’  exprList  ’)’] 

TRUE  [&  validRef(containingVar)] 

[&  cleanConsParmsdcontainingVar  ’.’]  functionid, 
exprList )] 

cE(functionDesignatorComponent) 

cleanFDC(functionDesignatorComponent) 

/•  standard  function  designators  */ 
cE(’Abs  (’  expr’)’) 

cE(expr)  &  well-behaved(’Abs  (’  expr  ’)’) 


cE(’Odd  (’  expr  ’)’)  cE(expr) 

cE(’Chr  (’  expr  ’)’) 

cE(expr)  &  0  <  expr  &  expr  ^  Char. Ord(Char. last) 

noie.  Char. Ord(Char. last)  is  a  fancy  name  for  an  implementation  dependent  manifest  constant, 
which  we  shall  assume  is  in  the  underlying  assertion  language. 

/•  standard  component  function  designators  */ 
cE(simpleType  ’.  Succ  (’  expr  ’)’) 

cleanT(simpleType)  Sc  cE(expr) 

Sc  expr  IN  simpleType  Sc  expr  ^  Last(simpleType) 
cE{ref  ’.  Succ  (’  expr  ’)’) 

cE(expr)  Sc  expr  IN  ItsType(ref) 

Sc  expr  ^  Last(ItsType(reO) 
cE(simpleType  ’.  Pred  (’  expr  ’)’) 

cleanT(simpieType)  Sc  cE(expr) 

Sc  expr  IN  simpleType  Sc  expr  First(simpleType) 
cE(ref  Pred  (’  expr  ’)’) 

cE(expr)  Sc  expr  IN  ItsTyp€(reO 
Sc  expr  ^  First(ItsType(ref)) 

cE(simpleType  ’.  Ord  (’  expr  ’)’) 

— *  cieanT(simpleType)  Sc  cE(expr)  Sc  expr  IN  simpleType 
cE(ref ’.  Ord  (’  expr  ')’) 

cE(expr)  Sc  expr  IN  ItsType(ref) 

syntax:  MM  *  'Max’  1  ’Min’ 

cE(simpleType  ’.’  MM  ’(’  expr  exprl}  ’)’) 

cleanT(simpleType)  Sc  cE(expr)  Sc  expr  IN  simpleType 
{&  cE(exprl)  Sc  exprl  IN  simpleType) 
cE(ref '.’  MM  ’(’  expr  exprl)  ’)’) 
cE(expr)  Sc  expr  IN  ItsType(ref) 

{cE(exprl)  &  exprl  IN  ItsType(ref)) 

cE(collectionVarld  ’.  Index  (’  pointer  ’)’  cE(pointer) 

/•  type  converter  designators  */ 
cE(typeConverterId  ’(’  expr  ’)’)  — "  cE(expr) 

checked  statically,  type  compatibility  of  source  and  target  types 


cE(’(’  expr  ’)’)  cE(expr) 

r  set  constructors  */ 

cE(setTypeld  ’(’  (element  elementl)]  ’)’) 
TRUE  [Sc  cleanSE(setTypeId,element) 
{&  cleanSE(setTypeld,elementl))] 

/•  set  operators  */ 

cE(expr  xor  exprl)  cE(expr)  &  cE(exprl) 
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r  generic  operators  */ 
cE(expr  genericOpr  exprl) 

IF  IntegerType?(ItsType(expr)) 

THEN  cE(expr)  &  cE(exprl) 

&  well-behaved(expr  genericOpr  exprl) 

ELSE  cE(expr)  &  cE(exprl)  FI 

/*  arithmetic  operators  */ 

cE(’-’  expr)  cE(expr)  &  well-behaved  expr) 
cE(expr  div  exprl) 

cE(expr)  &  cE(exprl)  &  exprl  ^  0 
&  well-behaved(expr  div  exprl) 
cE(expr  mod  Qxprl) 

cE(expr)  &  cE(exprl)  &  exprl  ^  0 
&  well-behaved(expr  moJ exprl) 

/*  relational  operators  */ 
cE(expr  comparisonOpr  exprl) 

IF  IiitegerType?(ItsType(expr)) 

THEN  cE(expr)  &cE(exprl) 

&  well-behaved(expr  comparisonOpr  exprl) 

ELSE  cE(expr)  &  cE(exprl)  FI 

checked  statically,  both  operands  are  (subranges  of)  the  same  type 

cECexpr  [a70/]  m  simpleType)  — ^  cE(expr)  &  cleanT(simpleType) 

note.  The  second  operand  of  an  inclusion  comparison  operators  must  be  a  set  type  or  an  index 
type. 

/*  Boolean  operators  */ 
cE(w/expr)  cE(expr) 

cE(expr  exprl)  cE(expr)  &  expr  |cE(exprl)) 

cE(expr  or  exprl)  cE(expr)  &  (expr  1  cE(exprl)) 

cE(expr exprl)  cE(expr)  &  (- expr  lcE(exprl)) 
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initialized:  VarRef  —  >  Assertion 


initialized(varRef)  varRef  IN  ItsType(var) 
validRef:  Ref  —>  Assertion 


yalidRef(varld)  TRUE 

validRef(consId)  —  TRUE 

yalidRef(ref expr  ’)’) 

cleanE(expr)  3c  validRef(ref) 

&  expr  IN  IndexType(ItsType(ref)) 

yalidRef(varRef '(’  pointer  ’)’) 

cleanE(pointer)  &  yalidRef(varRef) 

&  pointer  varRef  nil’ 

noie.  Only  variaoies  can  be  dynamic,  not  constants  or  function  designators. 

yalidRef(ref  fieldid) 

IF  flxedComponent?(ref,  fieldid) 

THEN  yalidRef(ref) 

ELSE  yalidRef(ren  &  initialized(ref  itsTag’) 

8c  yalidVariant(fieldld,  ItsType(ref))  FI 

yaIidRef(recordType  fieldid)  cleanT(recordType) 


yalidVariant:  Id  x  RecordType  ->  Assertion 

synonym  cL  =“  caseLabel 

cLL  *  caseLabelList 

note.  yalidVariant  is  only  applied  to  variant  records. 

yalidVariantCid,  record  [recordDcln]  case  tag 
[default  manifestCons] 
o/cLL  recordDcln  end  cL 

cLLl  ’*>’  recordDclnl  end  cL\] 

[otherwise  ’*  >  ’  recordDcln2]  end  case  endRecord) 

(in(id,  cLL)  —>  IdlnRecordDcln? (id,  recordDcln)) 

{&  in(id,  cLLl)  *>  IdlnRecordDcln ?(id,  recordDclnl))) 
[&  -•  in(id,  cLL)  {&  ”  in(id,  cLLD) 

»>  IdlnRecordDcln? (id,  recordDcln2)] 
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.cleanFDC:  FunctionDesignatorComponent  —  >  Assertion 


cIeanFDC(functionBase  ’(’  expr  ’)’) 

cleanFB(functionBase)  &  cleanE(expr) 

&  expr  IN  IndexType(ItsType(functionBase)) 

cieanFDC(functionBase  fieldid)  cleanFBCfunctionBase) 


cleanFB:  FunctionBase  — >  Assertion 

cleanFB(functionDesignator)  c!eanE(functionDesignator) 

cleanFB(functionDesignatorComponent) 

cleanFDC(functionDesignatorComponent) 


cleanSE:  SetTypeld  x  Element  —  >  Assertion 


cleanSE(setTypeId,  expr) 

cE(expr)  St  expr  IN  BaseType(setTypeld) 

cleanSE(setTypeId,  simpleType) 
cleanT(simpleType) 

&  First(BaseTyp€(setTypeId))  <  First(simpleType) 
&  Last(simpleType)  <  Last(BaseType(setTypeId)) 


well-behaved:  Expr  —  >  Assertion 

synonym  wb  =*  well-behaved, 

eRO  =■  extendedRangeOperand, 

SI  =“  Signedint, 

UI  —  Unsignedint 

note",  well-behaved  applies  only  to  arithmetic  operations  and  comparisons  of  arithmetic 

expressions,  and  specifies  when  they  are  well-behaved,  as  defined  in  the  Euclid  Report 


wb(expr  addingOpr  expr  I ) 

IF  eRO(expr)  |eRO(exprl) 

THEN  expr  addingOpr  exprl  IN  UI 
ELSE  expr  IN  SI  &  exprl  IN  SI 

&  expr  addingOpr  exprl  IN  SI  FI 

wb(expr  multiplyingOpr  exprl) 

expr  IN  SI  k  exprl  IN  SI  k  expr  multiplyingOpr  exprl  IN  SI 

wb(’-’  expr)  expr  IN  SI  k  expr  IN  SI 

wb(’Abs  (’  expr  ’)')  expr  IN  SI  k  'Abs  (’  expr  ’)’  IN  SI 

wb^expr  [no(\  ’=■’  exprl)  expr  IN  SI  k  exprl  IN  SI 

wb(expr  inequalityOpr  exprl) 

IF  eRO(expr) 

THEN  0  ^  exprl 
ELSE  IF  eRO(exprl) 

THEN  0  <  expr 

ELSE  expr  IN  SI  k  exprl  IN  SI  FI  FI 


extendedRangeOperand:  Expr  — >  Assertion 

synonym  eRO  =■  extendedRangeOperand, 

SI  —  Signedint, 

UI  *■  Unsignedint 

note.  extendedRangeOperand  applies  only  to  arithmetic  expressions,  and  specifies  when  an 
arithmetic  expression  is  an  extended  range  operand,  as  given  in  the  Report 


eRO(ref)  UI&notSI(ItsType(ref)) 

note.  A  constant  reference  is  always  of  a  subrange  type,  and  not  of  "Integer"  type,  so  it  is  legal  to 
invoke  Ul&notSI. 


eRO(literalCons) 
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Last(SI)  <  literalCons  &  HteralCons  ^  Last(UI) 
eRO(consStandardComponent)  FALSE 

eRO(functionValue)  UI&notSI(ItsType(functionValue)) 
eRO(’(’ expr ’)’)  ^  eRO(expr) 
eRO(’-’expr)  FALSE 

eRO(expr  addingOpr  exprl)  “*■  eRO(expr)  |  eRO(exprl) 
eRO(expr  muitiplyingOpr  exprl)  — *  FALSE 


Ul&notSl:  slmpleType  — >  Assertion 

note.  Ul&notSI  is  a  subsidiary  function  invoked  by  extendedRangeOperand  and  applies  only  to 
arithmetic  subrange  types  other  than  "Integer". 

UI&notSKsimpleType) 

0  ^  First(simpIeType)  &  Last(simpleType)  ^  Last(Unsignedlnt) 

&  Last(Signedlnt)  <  Last(simpleType) 
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ItsType:  Expr  —  >  Type 


r  variable  and  constant  refs  plus 

function  designators  and  function  designator 
•  components  */ 

syntax:  vRcRfB  “  varRef  1  consRef  1  functionBase 
ItsType(varld)  TYPE<varId> 

ItsType(consId)  TYPE<consId> 

ItsType(function  [’(’ exprList ’)’))  TYPE< function> 

ItsTypefvRcRfB  ’(’  expr  ’)’)  “*  ComponentType(ItsTyp€(vRcRfB)) 
ItsType(vRcRfB  ’(’  pointer  ’)’)  ObjectType(ItsType(vRcRfB)) 

ItsType(vRcRfB  fieldid) 

TypeOfRecordComponent(fieldId,  ItsTypeivRcRfB)) 
ItsType(recordType  fieidid) 

TypeOfRecordComponent(fieldId,  recordType) 

!tsType(unsignedNumber)  ’Integer’ 

ItsT)rp€(literaiString)  ’String  (’  Length(literalString)  ’)’ 

ItsType(literalChar)  “*  ’Char’ 

ItsType(enumeratedValueld)  TYPE<enumeratedValueId> 

ItsType(type  ’.  size’)  ’Integer’ 

ItsType(ref  ’.  size’)  ’Integer’ 

ItsType(type  ’.  alignment’)  ’Integer’ 

ItsTypefref ’.  alignment)  ’Integer’ 

ItsType(simpleType  ’.  first’)  simpleType 

ItsType(ref  ’.  first’)  ItsType(ref) 

ItsType(simpleType  ’.  last’)  simpleType 

ItsType(ref  ’.  last’)  ItsType(ref) 

ItsType(’StorageUnit .  sizeInBits’)  ’Integer’ 

ItsType(varRef ’.  sizelnBits’)  ’Integer’ 

ItsType( varRef  ’.  address’)  *  ’Integer’ 

ItsType(ref ’.  itsTag’)  TagType(ItsType(ref)) 

ItsTypefcollectionVarld  ’.  nil’)  ’*’  collectionVarld 

ItsTyp€(collectionVarId  ’(’  pointer  ’)  .  refCount’)  ’Integer’ 

ItsType(’Abs  (’  expr  ’)’)  ’Integer’ 

ItsType(’Odd  (’  expr  ’)’)  ’Boolean’ 

ItsType(’Chr  (’  expr  ’)’)  ’Char’ 

syntax:  SPMM  “  ’Succ’  1  ’Pred’  |  ’Max’  1  ’Min’ 

ItsTypefsimpleType  ’.’  SPMM  ’(’  exprList  ’)’)  simpleType 

ItsType(ref  ’.’  SPMM  ’(’  exprList  ’)’)  ItsType(reO 

ItsTypefsimpleType  ’.  Ord  (’  expr  ’)’)  ’Integer’ 

ItsType(ref  ’.  Ord  ('  expr  ’)’)  ^  ’Integer’ 

ItsType(collectionVarId  ’.  Index  (’  pointer  ’)’)  Integer 

ItsType(typeConverterId  ’(’  expr  ’)’)  TYPE<typeConverterld> 
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ItsType(’(’  expr  ’)')  ItsType(expr) 

ItsType(setTypeId  ’(’  [element  elementl}]  ’)’) 

/*  set  operations  */ 

ItsType(expr  ATor  expr  1)  ItsType{expr) 

r  generic  operations  */ 

ItsType(expr  genericOpr  exprl) 

IF  IntegerType?(ItsType(expr)) 

THEN  ’Integer’ 

'  ELSE  ItsType(expr)  FI 

/*  arithmetic  operations  * ! 

ItsType(’-’  expr)  ’Integer’ 

ItsType(expr  J/v  exprl)  ’Integer’ 

ItsType(expr  moc/ exprl)  ’Integer’ 

r  relational  operations  */ 

ItsType(expr  comparisonOpr  exprl)  ’Boolean’ 

ItsType(expr  [no^  in  simpleType)  ’Boolean’ 

/*  logical  operations  */ 

ItsType(/70/ expr)  ’Boolean’ 

ItsType(expr  binaryBooleanOpr  exprl)  ’Boolean' 


setTypeld 
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